Information theoretic axioms for Quantum Theory
In this paper we derive the complex Hilbert space formalism of quantum theory from four simple
information theoretic axioms. It is shown that quantum theory is the only non classical probabilistic 
theory satisfying the following axioms: distinguishability, conservation, reversibility, composition. 
The new results of this reconstruction compared to other reconstructions by other authors are: (i) we 
get rid of axiom "subspace" in favor of axiom conservation eliminating mathematical requirements 
contained in previous axiomatics; (ii) we are able to classify all the probabilistic theories that are 
consistent requiring (a) only the first two axioms (b) only the first three axioms; this could be 
useful in experimental tests of quantum theory since it gives the possibility to understand whether 
or not other mathematical models could be consistent with such tests; (iii) we provide a connection 
between two different approaches to quantum foundations, quantum logic and the one based on 
information theoretic primitives showing that any theory satisfying the first two axioms given above 
either is classical or is a theory in which physical systems are described by a projective geometry. 



One of the most curious facts about quantum theory is that it explains
almost every physical phenomenon except gravity, and yet it is still not
clear why it works so well. The first attempts to give rigorous foundations
for the rules of quantum theory initiated a subject called quantum logic.
The starting point in quantum logic is that propositions related to
measurements performable on a quantum system can be associated with
sentences of a propositional calculus. When the system is classical, the
propositions related to classical measurements form a Boolean algebra, and
Boolean algebras are the algebraic models of the calculus of classical
logic. The main question that quantum logic addresses is: when a Boolean
algebra is relaxed into an orthomodular nondistributive lattice (i.e. a
generalization of the lattice of subspaces of a Hilbert space), which logic
is it the model of? "Quantum Logic" is the name that designates the answer,
but there are several views about the content of this name and its physical
significance. Another notable and conceptually simpler way to look for a
physical explanation of the rules of quantum theory is to consider hidden
variables models [...]. The starting point to formulate these models is
that the state of a quantum system describes an ontic property of the
system and the randomness in quantum experiments is simply a consequence of
averaging the result over many repetitions of the experiment. These models
are appealing because they give simple explanations to questions regarding
the ontology of the state of a quantum system or the nature of the
measurement process. In such models a physical system in fact always
possesses a definite value of a physical quantity and a measurement is just
a reading of this value. One of the main drawbacks of considering this as
the physical explanation of quantum phenomena is that hidden variables in
these models cannot be Lorentz invariant and, at the same time, be used in
a model equivalent to quantum theory from the predictive point of view
[...]. The advent
of quantum information theory opened a new direction in research on the
foundations of quantum theory. Quantum information showed that the
controversial physical phenomena arising from the mathematics of quantum
theory can be exploited for information theoretic tasks that are impossible
in a world governed by classical physics [...]. It is then natural to ask
whether it is possible to put as foundations of the mathematical rules of
quantum theory a set of information theoretic principles. The first work
that partially answered this question in a positive way was [20]. Almost
ten years after the appearance of that paper, finding informational
principles for quantum theory is becoming a quite active area of research
[...]. Based on the reconstruction of [...], in [23] an argument is given
to eliminate the use of one of the axioms in the first reconstruction,
Simplicity. In [...], the argument developed in [...] is used to derive a
new set of requirements that, if imposed on a probabilistic theory, are
equivalent to the mathematical formalism of quantum theory. In [...],
purification is considered as a foundational principle at the base of all
the new information theoretic features of quantum information. It is shown
that all theories satisfying purification (namely, all states of a system A
are in one-to-one correspondence with pure states defined on a larger
composite system AB) and local discriminability (which is shown to be
equivalent to local tomography) are very similar to quantum theory from an
informational point of view. Starting from this work, the same authors gave
in [...] a reconstruction of quantum theory from six informational axioms,
two of which are purification and local discriminability. The author of
[20], ten years after his first reconstruction, invented a formulation of
quantum theory based on a new mathematical object called the duotensor and
gave a new set of operational/informational principles that are proved to
be equivalent to this formulation of quantum theory [...].

In this paper we give a new reconstruction of quantum theory based on a new
set of informational principles. The main result is the following: the only
non classical probabilistic theory satisfying a list of four axioms -
Distinguishability, Conservation, Reversibility, Composition - is quantum
theory. The new results of this reconstruction compared to other
reconstructions by other authors are: (i) we get rid of the subspace axiom
(see [23]) and of any other formulation of it (see the compression axiom in
[...]); (iia) we show that every probabilistic theory in which axioms
distinguishability and conservation hold is either classical or a
probabilistic theory in which pure states are points of a projective space
defined over a generic field of numbers; this is a generalization of
quantum theory in which the superposition principle holds with amplitudes
not necessarily complex but belonging to a generic field of numbers; (iib)
we show that if axiom reversibility is also required in (iia), then the
elementary system must be a hypersphere in generic dimension d; among these
theories we find generalizations of quantum theory in which complex numbers
can be substituted by normed real division algebras (i.e. reals, complex
numbers, quaternions and octonions); (iic) we argue that quantum theory
over the reals is consistent with all axioms apart from the local
tomography part of axiom composition (see below), which must be substituted
with 2-local tomography (see [...]). These classifications could be useful
to check whether the predictions of a given experimental test of quantum
theory are in principle consistent with other probabilistic theories or
not.

The paper is organized as follows. In section I we present the framework of
probabilistic theories from which quantum theory is reconstructed, and in
section II we present the axioms. The derivation begins with section III A.
Our derivation is based on the fact that pure states of a quantum system
with dimension n are points of a complex projective space of dimension
n - 1 (denoted as $\mathbb{CP}^{n-1}$). A point of $\mathbb{CP}^{n-1}$ is an
equivalence class of vectors in $\mathbb{C}^n$, constructed by regarding as
equivalent any two vectors related by

$(Z_1, Z_2, \ldots, Z_n) = (\lambda Z_1, \lambda Z_2, \ldots, \lambda Z_n), \quad \lambda \in \mathbb{C}, \ \lambda \neq 0$   (1)

As an example, the set of pure states of a qubit is the set of points of a
complex projective space of dimension 1 (i.e. $\mathbb{CP}^1$). This
manifold is indeed isomorphic to the ordinary sphere in $\mathbb{R}^3$ and
is usually referred to as the Bloch sphere. Projective spaces over a
generic field of numbers $\mathbb{K}$ can be defined in the same way given
above for the complex case but with the field $\mathbb{K}$ in place of
$\mathbb{C}$. In section III A we show that if a probabilistic theory
satisfies Distinguishability and Conservation then either it is classical
or it is a theory in which pure states are points of a projective space
over a generic field of numbers $\mathbb{K}$. Among these theories we find
quantum theory but also quantum theory over real numbers and quantum theory
over quaternions. In section III C it is shown that the only non classical
theory in this class satisfying axioms reversibility and composition to
describe composite systems is quantum theory. The last two sections are
devoted to discussion and conclusion.
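
The identification of $\mathbb{CP}^1$ with the Bloch sphere can be checked numerically. The sketch below is only an illustration under stated assumptions: the function name `bloch_vector` and the sample numbers are hypothetical, and the convention used is the standard one in which the sphere coordinates are the expectation values of the Pauli operators. It shows that two vectors of $\mathbb{C}^2$ related as in (1) land on the same point of the unit sphere.

```python
def bloch_vector(z1, z2):
    """Map a nonzero vector (z1, z2) of C^2 to a point of the unit sphere in R^3.

    The coordinates are the expectation values of the Pauli operators on the
    normalized vector (an assumed, standard convention).
    """
    z1, z2 = complex(z1), complex(z2)
    norm_sq = abs(z1) ** 2 + abs(z2) ** 2
    if norm_sq == 0:
        raise ValueError("the zero vector does not represent a state")
    c = z1.conjugate() * z2 / norm_sq
    x = 2 * c.real
    y = 2 * c.imag
    z = (abs(z1) ** 2 - abs(z2) ** 2) / norm_sq
    return (x, y, z)


# Two representatives of the same equivalence class, differing by lambda.
lam = 0.5 - 2j
point_a = bloch_vector(1 + 2j, 3 - 1j)
point_b = bloch_vector(lam * (1 + 2j), lam * (3 - 1j))
same_point = all(abs(a - b) < 1e-12 for a, b in zip(point_a, point_b))
```

Dividing by the squared norm is what removes the modulus of $\lambda$, while the product of one component with the conjugate of the other removes its phase.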



I. PROBABILISTIC THEORIES 

A generic probabilistic theory is a mathematical framework in which it is
possible to model any experimental setup and to calculate probabilities for
all the possible configurations of a setup. In such a framework
preparation, measurement and transformation devices are represented by
collections of outcomes, e.g.:

$P = \{\rho_i\}_{i \in X} \qquad M = \{a_j\}_{j \in Y} \qquad T = \{T_k\}_{k \in Z}$

where P, M, T are respectively a preparation, a measurement and a
transformation, while $\rho_i$, $a_j$, $T_k$ represent the corresponding
outcomes in the outcome sets X, Y, Z.

An outcome set of a physical device, in general, is not something that has
a well defined probability distribution on its own. The probability
distribution of the outcome set of a measurement device in an experimental
setup depends on the settings of the preparation device. This is clear
since, if we perform a measurement on a system prepared in some way, we
have a given probability distribution of the measurement outcomes, while if
we perform the same measurement on a system prepared in some different way
we obtain, in general, a different probability distribution.

Given a preparation and a measurement, $P = \{\rho_i\}_{i \in X}$,
$M = \{a_j\}_{j \in Y}$, we define their composition as:

$\circ : (a_j, \rho_i) \mapsto a_j \circ \rho_i \quad \forall (i, j) \in X \times Y$   (2)

Since a preparation followed by a measurement outcome is always associated
with the probability of the measurement outcome given that preparation, we
can associate to $a_j \circ \rho_i$ the probability of seeing measurement
outcome $a_j$ when performing measurement $\{a_j\}_{j \in Y}$ on a system
prepared in state $\rho_i$, namely:

$p : a_j \circ \rho_i \mapsto p(a_j|\rho_i)$   (3)

To use a notation that resembles the bra-ket notation of standard quantum
theory we will also define:

$(a_j|\rho_i) := p(a_j|\rho_i)$   (4)

Composing the two maps $\circ$ and $p$ we can define a new map M turning
any pair $(a_j, \rho_i)$ into a probability:

$M : (a_j, \rho_i) \mapsto (a_j|\rho_i)$   (5)



The probabilistic structure defined by (5) turns every $\rho_i$ into a
function from measurement outcomes to probabilities, given by
$M[(\cdot, \rho_i)]$. If $\rho_i$, $\rho_i'$ induce the same function, then
it is impossible to distinguish between them from the statistics of an
experiment. This means that the two outcomes of the preparation device are
equivalent: accordingly, we will take equivalence classes with respect to
this equivalence relation. We will thus identify the outcome $\rho_i$ with
the corresponding function and will call it a state. Accordingly, we will
refer to preparation devices as collections of states
$\{\rho_i\}_{i \in X}$. The same construction holds for measurements: every
measurement outcome $a_j$ induces a function from preparations to
probabilities, given by $M[(a_j, \cdot)]$. If two outcomes $a_j$, $a_j'$
induce the same function, then it is impossible to distinguish between them
from the statistics of an experiment. This means that the two outcomes are
equivalent: accordingly, we will take equivalence classes with respect to
this equivalence relation. We will thus identify the outcome $a_j$ with the
corresponding function and we will call it an effect. Accordingly, we will
refer to measurement devices as collections of effects
$\{a_j\}_{j \in Y}$.
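
The passage from preparation outcomes to states can be illustrated with a toy table of probabilities. The sketch below is an assumption-laden illustration rather than part of the derivation: the labels `prep_a`, `prep_b`, `prep_c`, the numbers, and the helper `group_into_states` are all hypothetical. Each preparation outcome is listed with the probabilities it assigns to a fixed set of measurement outcomes, and outcomes with identical lists are merged into one equivalence class, i.e. one state.

```python
from collections import defaultdict


def group_into_states(table):
    """Group preparation outcomes that induce the same probability function.

    `table` maps a label to a tuple of probabilities, one entry per
    measurement outcome. Labels with equal tuples cannot be told apart from
    the statistics and therefore represent the same state.
    """
    classes = defaultdict(list)
    for label, probs in table.items():
        key = tuple(round(p, 12) for p in probs)  # guard against float noise
        classes[key].append(label)
    return sorted(sorted(labels) for labels in classes.values())


# Hypothetical data: three preparation outcomes and three effects.
table = {
    "prep_a": (0.5, 0.5, 1.0),
    "prep_b": (0.2, 0.8, 1.0),
    "prep_c": (0.5, 0.5, 1.0),
}
states = group_into_states(table)
```

Here `prep_a` and `prep_c` end up in the same class, so the toy device has two states rather than three. The same grouping applied to the columns of the table instead of its rows would yield effects.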

The state of a system provides the information regarding the probabilities
of all the possible outcomes in all the possible measurements that can be
performed on a system prepared in a given configuration. Thus the state of
a system can be represented by a list of probabilities that, in principle,
could contain an infinite number of entries. Hence for a state $\rho$ of a
system we can write:

$\rho = \begin{pmatrix} \vdots \\ p_{a_j} \\ \vdots \end{pmatrix}$   (6)

where $p_{a_j}$ is the probability of the effect labeled by $a_j$ given the
state $\rho$. $\rho$ is thus represented by the vector (6), and two
different state vectors of a given system will differ in at least one of
their entries. In principle the number of entries for a state expressed as
in (6) is infinite since, typically, there is a continuum of possible
measurements associated with a system. For example, the spin of a particle
can be measured in all directions in space and thus we have a continuum of
possible measurements for the spin. However, in quantum theory a restricted
subset of entries suffices to specify a state as in (6). For a qubit, the
number of such entries is 4, i.e. the number of real parameters that are
necessary in order to specify a hermitian matrix acting on $\mathbb{C}^2$.
In [...] it is argued that this kind of compression is a general feature of
all physical theories and it is taken as a starting point to formulate a
framework for quantum gravity. In this paper we will assume that the number
of entries in the list of probabilities representing a state of a physical
system as in (6) is finite.
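
The claim that four entries suffice for a qubit can be made concrete. The following sketch is an illustration under assumptions that are not in the text: the four effects chosen are the projectors on spin up along z, x and y together with the identity, and the function names are hypothetical. It computes the four probabilities of a (possibly unnormalized) 2x2 hermitian matrix and rebuilds the matrix from them.

```python
def probabilities(rho):
    """Four probabilities fixing a 2x2 hermitian matrix rho (nested lists).

    Assumed effects: projectors on the +z, +x and +y directions, plus the
    identity, whose probability is the trace of rho.
    """
    rho01 = complex(rho[0][1])
    t = (complex(rho[0][0]) + complex(rho[1][1])).real
    p_z = complex(rho[0][0]).real
    p_x = t / 2 + rho01.real
    p_y = t / 2 - rho01.imag
    return (p_z, p_x, p_y, t)


def reconstruct(p):
    """Invert `probabilities`: rebuild the matrix from the four numbers."""
    p_z, p_x, p_y, t = p
    rho01 = complex(p_x - t / 2, t / 2 - p_y)
    return [[p_z, rho01], [rho01.conjugate(), t - p_z]]


# An arbitrary illustrative state.
rho = [[0.7, 0.1 - 0.2j], [0.1 + 0.2j, 0.3]]
rebuilt = reconstruct(probabilities(rho))
```

Any other four effects that span the space of 2x2 hermitian matrices would serve equally well; the choice above only keeps the inversion formulas short.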

From the above considerations we have that states can be represented as
elements of a real finite dimensional vector space, while effects can be
represented in the vector space constituting the dual of the state space.
As a consequence, transformations of a system A can be represented as
operators defined on the vector space in which the states of A are
represented.

Definition 1 The dimension of the real vector space 
where states of a system are represented as vectors of 
probabilities is the dimension of the system. 

For a quantum system seen as an object of a probabilistic theory, the
dimension of the Hilbert space associated with a system does not coincide
with the dimension of the system. The dimension of a quantum system in this
framework is the dimension of the real vector space in which the density
matrices representing states are defined. This is the real vector space
generated by hermitian operators acting on the Hilbert space associated
with the system. The dimension of this real vector space is $d = n^2$,
where $\mathbb{C}^n$ is the Hilbert space associated with the system. This
notion of dimension of a system is introduced in [...].
The probability distribution of the effects $A = \{a_j\}_j$ in measurement
A always depends on what state is prepared. It must then hold that the
probability of occurrence of state $\rho_i$ is independent of the
measurement that is performed. If it were not so, then the probability
distribution of the effects in a measurement would depend on the state of
the physical system and, at the same time, would be something on which the
state of the system depends. If this were the case, then states and effects
would be related in a non linear way and the function associating pairs of
state and effect to a probability would be non linear. This would prevent
the possibility of forming mixtures of states (see below) and of
representing states in a real vector space. Since the probability of
occurrence of state $\rho_i$ must be independent of the measurement
performed we must have:



$(\sum_{j \in Y} a_j | \rho_i) = (\sum_{l \in S} b_l | \rho_i) \quad \forall \{a_j\}_{j \in Y}, \{b_l\}_{l \in S}$   (7)

where $\sum_{j \in Y} a_j$ and $\sum_{l \in S} b_l$ represent respectively
a singleton (i.e. a single outcome measurement) constituted by the coarse
graining of all the outcomes in measurements $A = \{a_j\}_{j \in Y}$ and
$B = \{b_l\}_{l \in S}$. Condition (7) is stated in [25] and is associated
to causality. We now give the following:



Definition 2 Given any measurement $M = \{m_j\}_{j \in Y}$,
the coarse graining of the effects in M, $\bigcup_{j \in Y} m_j$, is called
deterministic effect.

We now state the following characterization of (7):

Proposition 1 Condition (7) is equivalent to requiring that the
deterministic effect is unique for all measurements.

This proposition is proved in [25].

Given (7), the probability distribution of the states of a preparation
device $P = \{\rho_i\}_{i \in X}$ must be independent of the settings and
outcomes of other devices in any experiment. This implies that every
preparation $P = \{\rho_i\}_{i \in X}$ must be associated with a
probability distribution $\{p_i\}_{i \in X}$ for the corresponding states.
Hence every state $\rho$ can be represented as a convex combination of
other states, namely:

$\rho = \sum_{i \in X} p_i \rho_i$   (8)



From (8) and the fact that states can be represented as
in (6) we have that the state space of any physical system
is a compact convex set. This leads us immediately to
the following definition:
Definition 3 A state is mixed if it can be represented as
a convex combination of at least two other states. A state 
is pure if the convex combination representing it contains 
only one state. 

Since we have that $T\rho = \sum_{i \in X} p_i T\rho_i$ for every
transformation $T$ and every state $\rho$, transformations must be linear
operators. Note however that the convex decomposition of a state $\rho$ is
not required to be unique. There can be many ways to express the same state
$\rho$ as a convex combination of other states.
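
The non-uniqueness of convex decompositions is easy to exhibit in quantum theory. The sketch below is a minimal illustration, with hypothetical helper names `projector` and `mixture` and a qubit chosen as the example system: an equal mixture of the two computational basis states and an equal mixture of the two states $(|0\rangle \pm |1\rangle)/\sqrt{2}$ give the same density matrix.

```python
import math


def projector(v):
    """Rank-one matrix |v><v| for a vector v given as a list of numbers."""
    n = len(v)
    return [[complex(v[i]) * complex(v[j]).conjugate() for j in range(n)]
            for i in range(n)]


def mixture(weights, states):
    """Convex combination sum_i p_i rho_i of matrices given as nested lists."""
    n = len(states[0])
    return [[sum(w * s[i][j] for w, s in zip(weights, states))
             for j in range(n)] for i in range(n)]


s = 1 / math.sqrt(2)
ensemble_z = mixture([0.5, 0.5], [projector([1, 0]), projector([0, 1])])
ensemble_x = mixture([0.5, 0.5], [projector([s, s]), projector([s, -s])])
same_state = all(abs(ensemble_z[i][j] - ensemble_x[i][j]) < 1e-12
                 for i in range(2) for j in range(2))
```

Both ensembles yield half the identity matrix, so no measurement statistics can reveal which of the two preparations was used.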

Definition 4 The refinement of state p is defined to be 
the set of states that can appear in a convex decomposi- 
tion representing p. 

We will denote the refinement set of a state $\rho$ as $F_\rho$. The notion
of refinement of a state will play a crucial role in this derivation. It is
introduced in the framework of probabilistic theories in [21]. From the
mathematical point of view, the refinement of a state coincides with the
notion of face of a convex set. A face F of a convex set S is defined to be
a convex subset of S such that, given two points $x_1, x_2 \in S$, if
$\lambda x_1 + (1 - \lambda) x_2 \in F$ for some $0 < \lambda < 1$ then
$x_1, x_2 \in F$. From definition 4, we see that the refinement of a state
$\rho$ is a face of the (convex) state space of the system of which $\rho$
represents a preparation.

Definition 5 A state p is completely mixed relatively to 
a set of states S , if all the states in S can appear in the 
convex decomposition of p. 

A physical system can be used to store information. Storage of information
in a physical system is possible if a protocol can be defined to read that
information. To define such a protocol we give the following:

Definition 6 The set of states $\{\rho_i\}_{i=1}^{N}$ is perfectly
distinguishable if there exists a measurement $A = \{a_j\}_{j=1}^{N}$
such that $(a_j|\rho_i) = \delta_{ij}$. The set of effects
$\{a_j\}_{j=1}^{N}$ is a set of perfectly discriminating effects.

Given the above definition we see that if the set of states of a system
contains at least two perfectly distinguishable states, the system can be
used to store information. Referring to the above definition, if one
prepares a state belonging to a set of perfectly distinguishable states
$\{\rho_i\}_{i=1}^{N}$, say $\rho_{i_0}$, then by performing the
measurement $A = \{a_j\}_{j=1}^{N}$ one can say with certainty that the
state prepared was $\rho_{i_0}$ upon seeing the effect $a_{i_0}$. This in
turn defines a protocol to read the information that can be stored in the
physical system. A list of L symbols chosen from the set $\{i\}_{i=1}^{N}$
can be stored in L copies of the system with preparations chosen in the set
of states $\{\rho_i\}_{i=1}^{N}$.

In quantum theory a system having states defined on a Hilbert space of
dimension n, $\mathbb{C}^n$, has n perfectly distinguishable states.
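
Definition 6 can be checked directly for a quantum system. The sketch below is an illustration under assumptions: probabilities are computed with the usual trace rule of quantum theory, and the helper names are hypothetical. It verifies that the basis states of $\mathbb{C}^3$, measured with the projectors on the same basis, give $(a_j|\rho_i) = \delta_{ij}$, and that the same test fails for a pair of non-orthogonal qubit states measured in the computational basis.

```python
import math


def projector(v):
    """Rank-one matrix |v><v| for a vector v given as a list of numbers."""
    n = len(v)
    return [[complex(v[i]) * complex(v[j]).conjugate() for j in range(n)]
            for i in range(n)]


def probability(effect, state):
    """Trace rule (a|rho) = Tr(a rho) for matrices given as nested lists."""
    n = len(state)
    return sum(effect[i][k] * state[k][i]
               for i in range(n) for k in range(n)).real


def is_perfectly_distinguishable(states, effects, tol=1e-12):
    """True if (a_j|rho_i) equals the Kronecker delta for the given effects."""
    return all(abs(probability(effects[j], states[i]) - (1.0 if i == j else 0.0)) <= tol
               for i in range(len(states)) for j in range(len(effects)))


basis = [projector(e) for e in ([1, 0, 0], [0, 1, 0], [0, 0, 1])]
basis_ok = is_perfectly_distinguishable(basis, basis)

s = 1 / math.sqrt(2)
overlapping = [projector([1, 0]), projector([s, s])]
qubit_basis = [projector([1, 0]), projector([0, 1])]
overlap_ok = is_perfectly_distinguishable(overlapping, qubit_basis)
```

The function only tests one given measurement; a negative answer for that measurement does not by itself exclude that some other measurement might succeed.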

We conclude this section with the following two defi- 
nitions 



Definition 7 The set of perfectly distinguishable states
$\{\rho_i\}_{i=1}^{N} \subset S$, where S is a generic set of states, is
maximal in S if there does not exist a state $\sigma$ in S such that
$\{\rho_i\}_{i=1}^{N} \cup \sigma$ is a perfectly distinguishable set of
states.

Definition 8 The information capacity of a set of states 
is the cardinality of the largest maximal set of perfectly 
distinguishable states in that set. 

II. AXIOMS FOR QUANTUM THEORY 

In what follows we will impose a set of four axioms that are very natural
for a generic probabilistic theory in which systems can be used to store
information. It will be shown in the following sections that the only non
classical theory satisfying all these axioms is quantum theory. The axioms
are called Distinguishability, Conservation, Reversibility, Composition.
The starting point to formulate them is that every physical system can be
used to store information. To understand the meaning of this, consider a
system as simple as a die. On every face of a die there is a certain number
of dots from one to six. Suppose that one has to store and retrieve a text
written using an alphabet of six elements $\{a, b, \ldots, f\}$. One can
encode every letter of the alphabet into a number from one to six. The text
is an ensemble of letters that have a certain probability of appearing,
$\{p_a, p_b, \ldots, p_f\}$. This text can thus be stored into an ensemble
of dice that are prepared according to the probability distribution
$\{p_a, p_b, \ldots, p_f\}$. The preparation of the ensemble for storage of
the text can be performed by placing each die in the ensemble on a surface
in such a way that the face corresponding to the stored letter is hidden.
One can retrieve the text simply by looking at which face is hidden for
every die. The above protocol to store a text into an ensemble of systems
can be accomplished with a quantum system with Hilbert space dimension six
as well. One can choose a basis for the system, $\{|i\rangle\}_{i=1}^{6}$,
and encode every letter of the alphabet into one of these states. Storage
of the text into an ensemble of physical systems can be performed by
preparing each quantum system in the ensemble in one of these states. The
text can thus be retrieved by measuring each element of the ensemble of
quantum systems in the above basis.

We are now going to state and explain the axioms. 

Axiom 1 - Distinguishability Every state 
that is not completely mixed relatively to a 
given refinement set is perfectly distinguish- 
able from another state in that set. 

This axiom is a stronger version of axiom distinguishability used in [...].
Recall that the notion of refinement set of a state coincides with the
notion of face of a convex set. This axiom tells that if $\rho$ is not
completely mixed in a given set of states F of a system, constituting a
face, then there exist another state $\phi$ in that face and a measurement
$A = \{a_\rho, a_\phi\}$ that can be performed on system S such that
$(a_x|y) = \delta_{xy}$ for $x, y \in \{\rho, \phi\}$. To see this in a
simple context consider again a die. Take a state of a die described by the
statistical mixture of the face "one" (here we mean the physical face of
the die that has one dot on it, not the mathematical concept of convex
analysis) and the face "six" with probabilities
$\{p_{one}, p_{six} = (1 - p_{one})\}$. This state is not completely mixed
since there exist faces of the die that are not used in the above mixture,
e.g. the face with three dots that we denote face "three". The axiom tells
that face "three" can be used with the above mixture,
$\{p_{one}, p_{six} = (1 - p_{one})\}$, to store one bit of information.
Indeed one can choose to assign the logical value "0" to both faces "one"
and "six" and the logical value "1" to face "three". This axiom holds
classically. Recall that the set of states of a system with information
capacity n is a simplex in $\mathbb{R}^{n-1}$. The refinement of a mixed
state $\rho$ that is not completely mixed constitutes a simplex in
$\mathbb{R}^{m}$ with $m < n - 1$. Any pure state $\phi$ not belonging to
$F_\rho$ is represented by a vector in $\mathbb{R}^{n-1}$ orthogonal to the
subspace in which $F_\rho$ is embedded. Hence $\phi$ is perfectly
distinguishable from all states in $F_\rho$, hence also from $\rho$ itself.
The axiom holds also in quantum theory. The refinement set of a mixed state
$\rho$ of a system with information capacity n is the convex hull of a
$\mathbb{CP}^{m-1}$, where $F_\rho$ has information capacity $m < n$. Any
pure state $\phi$ not belonging to $F_\rho$ is represented by a complex
vector orthogonal to the subspace representing $F_\rho$. Hence $\phi$ is
perfectly distinguishable from all the pure states in $F_\rho$ and thus
from $\rho$ itself.
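
The die example for axiom 1 can be written out explicitly. The sketch below is only an illustration: the value 0.3 for $p_{one}$ is arbitrary, and the representation of classical states and effects as lists of six numbers, as well as the helper names, are assumptions made here. It builds the mixture of faces "one" and "six", the pure state "three", and a two-outcome measurement that separates them with certainty.

```python
FACES = ("one", "two", "three", "four", "five", "six")


def state(weights):
    """Probability vector over the six faces from a dict face -> probability."""
    return [weights.get(face, 0.0) for face in FACES]


def effect(faces):
    """0/1 vector answering 'yes' exactly on the listed faces."""
    return [1.0 if face in faces else 0.0 for face in FACES]


def probability(eff, st):
    """Classical probability rule: pairing of an effect with a state."""
    return sum(e * s for e, s in zip(eff, st))


p_one = 0.3  # arbitrary illustrative value
rho = state({"one": p_one, "six": 1 - p_one})
phi = state({"three": 1.0})

# Logical value "0" on faces one and six, logical value "1" on the others.
a_rho = effect({"one", "six"})
a_phi = effect({"two", "three", "four", "five"})

table = [[probability(a, s) for s in (rho, phi)] for a in (a_rho, a_phi)]
```

The two effects sum to the constant vector of ones, so together they form a complete measurement, and `table` is the identity matrix up to rounding, which is the condition of Definition 6 for the pair $\rho$, $\phi$.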

Axiom 2 - Conservation The information 
capacity of the refinement of any mixture of 
two states is less than or equal to the sum of 
the information capacities of the refinements 
of the two states composing the mixture, with 
equality if the states are perfectly distinguish- 
able. 

Let $\eta = p\sigma + (1 - p)\rho$. The axiom tells that there cannot exist
sets of states that are not perfectly distinguishable in $F_\sigma$ and
sets of states that are not perfectly distinguishable in $F_\rho$ that
become sets of perfectly distinguishable states if regarded as states of
$F_\eta$. Moreover, if $\rho$ and $\sigma$ are perfectly distinguishable
states, then the information capacity of $F_\eta$ is the sum of the
information capacities of $F_\rho$ and $F_\sigma$. Hence the number of pure
states usable to store information in $F_\eta$ is the sum of the number of
pure states usable to store information in $F_\sigma$ and the number of
pure states usable to store information in $F_\rho$. As an example,
consider two mixtures of different faces of a die, say a mixture of "one"
and "six", $\{p_{one}, p_{six} = (1 - p_{one})\}$, and a mixture of "two"
and "four", $\{p_{two}, p_{four} = (1 - p_{two})\}$. The axiom states that
if we prepare the mixture
$\{q p_{one}, q p_{six}, t p_{two}, t p_{four}\}$ with $q = 1 - t$, then
the number of values that can be used to store information in this mixture
is not greater than four, that is, the number of values used in preparing
the above two mixtures, namely "one", "six", "two", "four". This axiom
clearly holds in classical theory. Let $\sigma$ and $\rho$ be such that
$F_\sigma$ and $F_\rho$ are two simplexes with s and r vertices
respectively, living in $\mathbb{R}^{s-1}$ and $\mathbb{R}^{r-1}$. The
refinement set of $\eta = p\rho + (1 - p)\sigma$ has dimension
$s + r - t - 1$, where we assumed the intersection of $F_\rho$ and
$F_\sigma$ to be a simplex in $\mathbb{R}^{t-1}$. $F_\eta$ is a simplex in
$\mathbb{R}^{r+s-t-1}$ and has information capacity $r + s - t$ where
$t \geq 0$, with equality iff $\rho$ and $\sigma$ are perfectly
distinguishable. The axiom holds in quantum theory as well. Take a convex
combination of two density matrices $\rho$ and $\sigma$. $F_\rho$ and
$F_\sigma$ are the convex hulls of the spaces of rays in $\mathbb{C}^r$ and
$\mathbb{C}^s$ respectively. $F_\eta$ is the convex hull of the rays
belonging to the smallest complex vector space containing both
$\mathbb{C}^r$ and $\mathbb{C}^s$, i.e. $\mathbb{C}^{r+s-t}$ where
$\mathbb{C}^t$ is the intersection of $\mathbb{C}^r$ and $\mathbb{C}^s$.
The information capacity of $F_\eta$ is thus $r + s - t$ with $t \geq 0$,
with equality iff $\rho$ and $\sigma$ are perfectly distinguishable.
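
The classical counting $r + s - t$ can be reproduced with the die. The sketch below is an illustration under assumptions made here: a classical state is a list of six probabilities, the information capacity of its refinement is taken to be the number of faces with nonzero probability, and the helper names and numerical weights are hypothetical.

```python
FACES = ("one", "two", "three", "four", "five", "six")


def state(weights):
    """Probability vector over the six faces from a dict face -> probability."""
    return [weights.get(face, 0.0) for face in FACES]


def support(st, tol=1e-12):
    """Set of faces that occur with nonzero probability in the state."""
    return {face for face, p in zip(FACES, st) if p > tol}


def capacity(st):
    """Information capacity of the refinement of a classical state."""
    return len(support(st))


def mix(q, a, b):
    """Convex combination q*a + (1-q)*b of two probability vectors."""
    return [q * x + (1 - q) * y for x, y in zip(a, b)]


rho = state({"one": 0.3, "six": 0.7})
sigma_disjoint = state({"two": 0.5, "four": 0.5})
sigma_overlap = state({"six": 0.5, "two": 0.5})

# Perfectly distinguishable case: capacities add up (t = 0).
cap_disjoint = capacity(mix(0.4, rho, sigma_disjoint))
# Overlapping case: the shared face "six" is counted once (t = 1).
cap_overlap = capacity(mix(0.4, rho, sigma_overlap))
```

With these numbers `cap_disjoint` is 2 + 2 and `cap_overlap` is 2 + 2 - 1, in agreement with the inequality stated in axiom 2.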

Axiom 3 - Reversibility For any two pure
states of a system, $\phi$, $\psi$, there exists a
reversible transformation $T$ such that $\phi = T\psi$.

The significance of the above axiom is simply that one can transform any
pure state into any other with a reversible transformation. This axiom
clearly holds classically since one can transform any pure state of a
simplex in $\mathbb{R}^{n-1}$ into any other by applying a transformation
in the symmetric group $S_n$ (i.e. the group whose elements are the
permutations of n objects, with composition rule being sequential
composition of permutations). This axiom also holds in quantum theory since
the set of pure states of a system with Hilbert space $\mathbb{C}^n$ is
such that any pure state can be transformed into any other with a
transformation belonging to the group SU(n).
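
For the quantum case, a reversible transformation connecting two given pure states can be constructed explicitly. The sketch below uses a Householder reflection combined with a phase; this particular construction is a choice made here for illustration and is not taken from the text, and the function names are hypothetical. The resulting matrix is unitary but need not have unit determinant; multiplying it by a global phase to bring it into SU(n) does not change the ray it produces.

```python
def inner(u, v):
    """Inner product <u|v>, antilinear in the first argument."""
    return sum(a.conjugate() * b for a, b in zip(u, v))


def unitary_taking(psi, phi, tol=1e-12):
    """Return a unitary U (nested lists) with U psi = phi for unit vectors."""
    psi = [complex(a) for a in psi]
    phi = [complex(b) for b in phi]
    n = len(psi)
    overlap = inner(psi, phi)
    phase = overlap / abs(overlap) if abs(overlap) > tol else 1.0
    # Rotate the phase of phi so that its overlap with psi is real.
    target = [b / phase for b in phi]
    w = [a - b for a, b in zip(psi, target)]
    w_norm_sq = inner(w, w).real
    identity = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    if w_norm_sq <= tol:
        reflection = identity  # psi already equals the rotated target
    else:
        reflection = [[identity[i][j] - 2 * w[i] * w[j].conjugate() / w_norm_sq
                       for j in range(n)] for i in range(n)]
    return [[phase * reflection[i][j] for j in range(n)] for i in range(n)]


def apply(matrix, vector):
    """Matrix-vector product for nested lists."""
    return [sum(row[j] * vector[j] for j in range(len(vector))) for row in matrix]


s = 1 / 3 ** 0.5
psi = [1, 0, 0]
phi = [s, 1j * s, -s]
U = unitary_taking(psi, phi)
image = apply(U, psi)
```

The reflection step works because, once the overlap of the two unit vectors is real, reflecting across the hyperplane orthogonal to their difference exchanges them.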

Axiom 4 - Composition If $d_A$ and $c_A$ are
the dimension and the information capacity
of system A then

$d_{ABC\ldots} = d_A d_B d_C \cdots$

and

$c_{ABC\ldots} = c_A c_B c_C \cdots$

where ABC... is the system composed of systems A, B, C, ...

The first part of the above axiom is called local tomography. Every state
of a system in a probabilistic theory can be represented as a list of a
certain number of probabilities obtained by performing an equal number of
different measurements on the system in that state (see (6)). The number of
measurements that are sufficient in order to uniquely determine a state as
a vector of probabilities is the number of degrees of freedom of the
system, which is equal to its dimension by definition (see also [20]). The
first part of axiom composition requires that the number of degrees of
freedom of a composite system be the product of the numbers of degrees of
freedom of the components. An important consequence of this axiom is that
the real vector space in which the state space of a composite system
$S_{AB}$ can be represented is the tensor product of the vector spaces in
which the state spaces of the component systems, $S_A$, $S_B$, are
represented. The second part states simply that, for a system composed of a
certain number of component systems, the information capacity of the
composite system is the product of the information capacities of the
components. To see that the above axiom holds classically, note that the
state space of a composite system is a simplex in the real vector space
given by the tensor product of the vector spaces where the components live.
Hence the first part of the axiom holds. Since the information capacity of
a classical system is equal to the number of degrees of freedom of the
system, the second part holds classically as well. The axiom holds in
quantum theory too. Indeed a density matrix of a composite system,
$\rho_{AB}$, is a hermitian operator belonging to the tensor product of the
spaces of hermitian operators where the density matrices of the component
systems A, B live. These latter spaces are those of the operators acting on
$\mathbb{C}^{n_A}$ and $\mathbb{C}^{n_B}$ respectively. Hence the first
part of the axiom holds. The second part holds as well since the
information capacity of the state space of a quantum system coincides with
the dimension of the complex vector space on which the density matrices
describing states of the system are defined. $\mathbb{C}^{n_A}$ and
$\mathbb{C}^{n_B}$ are Hilbert spaces of component systems with information
capacity $n_A$ and $n_B$ respectively, and
$\mathbb{C}^{n_A} \otimes \mathbb{C}^{n_B}$ is the Hilbert space for the
composite system with information capacity $n_A n_B$.
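
The first part of axiom 4 amounts to a parameter count that can be checked mechanically. The sketch below is an illustration under assumptions made here: the dimension of a system with information capacity n is taken to be n for a classical system (unnormalized probability vectors), $n^2$ for complex quantum theory (hermitian matrices) and $n(n+1)/2$ for quantum theory over the reals (real symmetric matrices), and the function names are hypothetical.

```python
def dim_classical(n):
    """Real parameters of an unnormalized probability vector with n entries."""
    return n


def dim_complex(n):
    """Real parameters of an n x n hermitian matrix."""
    return n * n


def dim_real(n):
    """Real parameters of an n x n real symmetric matrix."""
    return n * (n + 1) // 2


def is_locally_tomographic(dim, n_a, n_b):
    """Check d_AB = d_A * d_B when capacities multiply, c_AB = n_a * n_b."""
    return dim(n_a * n_b) == dim(n_a) * dim(n_b)


pairs = [(n_a, n_b) for n_a in range(2, 6) for n_b in range(2, 6)]
classical_ok = all(is_locally_tomographic(dim_classical, a, b) for a, b in pairs)
complex_ok = all(is_locally_tomographic(dim_complex, a, b) for a, b in pairs)
real_ok = any(is_locally_tomographic(dim_real, a, b) for a, b in pairs)
```

For two real qubits the count gives 10 parameters for the composite against 3 times 3 for the parts, which is the mismatch behind the remark that quantum theory over the reals does not satisfy local tomography.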



III. PROOF OF THE MAIN RESULT 
A. Many quantum theories 

In this section we show that a probabilistic theory satisfying axioms 1,2
either is classical, or is a probabilistic theory in which pure states are
points of a projective space. The fact that classical probability theory
satisfies axioms 1,2 was already proved in the previous section. We are
thus going to prove the remaining alternative, namely, that pure states of
a system with information capacity n + 1 of a probabilistic theory
satisfying axioms 1,2 are points in a projective space of dimension n. From
this fact it will follow that this class of theories is such that pure
states are elements of a vector space defined over a generic field of
numbers (more precisely, a generic field of numbers is called a division
ring). This implies that the superposition principle holds for theories in
this class. Pure states of such theories can be represented by elements of
a vector space in which all states in a subspace A are perfectly
distinguishable from all states in a subspace B disjoint from A, and linear
combinations of elements in disjoint subspaces represent allowed pure
states.

A projective space of generic dimension n is defined as follows [30]:

Definition 9 Projective space of dimension n 



An arbitrary set S, together with a family of subsets, 
, that are called subspaces of dimension j , is a projec- 
tive space of dimension n if the following holds: 

(i) The only —1-dimensional subspace is the empty set. 

(ii) 0-dimensional subspaces of are the point subsets 
ofS. 

(iii) There is a unique subspace of dimension n, F" . 

(iv) If F^ and i^" are two subspaces of F^^ , and F^ is 
contained in F^ then = F'^ if and only if r ~ s. 

(v) Given two subspaces F^ and F'^ of F^ , if F^ is the 
intersection of F^ and F" then _F* is a subspace of F" . 

(vi) Given two subspaces F^ and i^" of F^ , if F^ = 
ij ps pt intersection of F^ and F" , then: 

s + r = t + m. 
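As a concrete check of item (vi), the following sketch enumerates all subspaces of the projective plane over the two-element field GF(2) and verifies the dimension formula for every pair. The choice of GF(2), the bitmask encoding of vectors and all function names are assumptions made for illustration; they do not appear in the derivation.

```python
from itertools import combinations

def span(generators):
    """All vectors (encoded as bitmasks) in the GF(2)-span of the generators."""
    vectors = {0}
    for g in generators:
        vectors |= {v ^ g for v in vectors}
    return frozenset(vectors)

def proj_dim(subspace):
    """Projective dimension: vector-space dimension minus one.

    A subspace with 2**k vectors has vector dimension k. The zero subspace
    contains no projective point and gets dimension -1, as in item (i).
    """
    return len(subspace).bit_length() - 2

def all_subspaces(n_bits):
    """Every linear subspace of GF(2)**n_bits, generated by brute force."""
    nonzero = range(1, 2 ** n_bits)
    return {span(c) for k in range(n_bits + 1) for c in combinations(nonzero, k)}

def dimension_formula_holds(n_bits):
    """Check r + s = t + m for all pairs of subspaces."""
    subspaces = all_subspaces(n_bits)
    for a in subspaces:
        for b in subspaces:
            meet = a & b        # set intersection is again a subspace
            join = span(a | b)  # smallest subspace containing both
            if proj_dim(a) + proj_dim(b) != proj_dim(meet) + proj_dim(join):
                return False
    return True

fano_ok = dimension_formula_holds(3)
```

Here the intersection is computed directly as a set intersection rather than from the formula itself, so the check is not circular.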

In the following we will show that Definition 9 holds for pure states, and for a family of subsets of
these states, of a system with information capacity n + 1 described by a probabilistic theory
satisfying axioms 1,2. To prove that Definition 9 holds for pure states of a system of a probabilistic
theory we have to define the notion of subspace in this context. In what follows we will denote by S
the set of states of a generic system.

Definition 10 Given the set of states of a generic system S with information capacity n + 1 and any
state $\rho \in S$ such that the cardinality of the largest maximal set of perfectly distinguishable
states in $F_\rho$ is $j + 1 \le n + 1$, the set $F_\rho$ is called subspace of S with dimension j and
denoted $F^j_\rho$.

Definition 11 The empty set is defined to be a subspace of the set of states of a system S with
dimension $-1$ and is denoted $F^{-1}$.

Definition 12 Pure states in S are defined to be subspaces of dimension 0 and denoted $F^0$.

In the following we will suppose that S is the set of states of a generic system with information
capacity n + 1 of a generic probabilistic theory satisfying axioms 1,2. Lemmas 1-4 and Theorem 1 are
needed to prove Theorem 2, which contains the most important part of this section.

Lemma 1 The only subspace of dimension n in S is S 
itself. 

Proof: Let $F_\theta$ be a subspace of dimension n in S generated by the refinement of a mixed state
$\theta$. We prove the thesis by showing that $\theta$ is completely mixed in S. Suppose it is not so,
i.e. $\theta$ is not completely mixed. From axiom distinguishability $\theta$ is perfectly
distinguishable from some state $\phi$. The state $p\theta + (1 - p)\phi$ is in S and its refinement
has information capacity greater than or equal to n + 2 from axiom conservation. This contradicts the
hypothesis that S has information capacity n + 1 and proves the thesis. ■

Lemma 2 If $F^r_\rho$ and $F^s_\sigma$ are both subspaces of S and $F^r_\rho \subseteq F^s_\sigma$
then $r \le s$. Moreover r = s iff $F^r_\rho = F^s_\sigma$.

Proof: If $F^r_\rho \subseteq F^s_\sigma$ then the information capacity of $F^r_\rho$ cannot exceed
that of $F^s_\sigma$. If $F^r_\rho$ and $F^s_\sigma$ are the same subspace then clearly r = s. To
prove the converse note that by hypothesis for all $\phi \in F^r_\rho$ we must have
$\phi \in F^s_\sigma$. Now suppose that r = s and $F^r_\rho \subset F^s_\sigma$ strictly. This means
that $\rho$ is not completely mixed in $F_\sigma$. From axiom distinguishability $\rho$ is perfectly
distinguishable from some state $\phi$ in $F^s_\sigma$. The refinement of the state
$\omega = p\rho + (1 - p)\phi$ must have information capacity equal to s + 2 by hypothesis and from
axiom conservation. But this is absurd since $\omega \in F^s_\sigma$ and $F_\omega$ can have
information capacity at most s + 1 by Definition 10. Hence if $F^r_\rho \subseteq F^s_\sigma$ and
r = s then we must have $F^r_\rho = F^s_\sigma$. ■

Lemma 3 The refinement of a mixture of any two pure 
states of a system has information capacity two. 

Proof: From axiom conservation the refinement of a convex combination of any two pure states of a
system cannot have information capacity greater than two. Since, by definition, neither of the two
pure states is completely mixed, the thesis follows from axiom distinguishability. ■



Definition 13 Given two subspaces of a system S, $F^r_\rho$ and $F^s_\sigma$, we denote their
intersection as $F^r_\rho \wedge F^s_\sigma$.

Lemma 4 Given two subspaces of S, $F^r_\rho$ and $F^s_\sigma$, $F^r_\rho \wedge F^s_\sigma$ is a
subspace of S.

Proof: The intersection of two convex sets is a convex set. Hence there exists a state $\tau$ that is
completely mixed in the set $F_\tau = F_\rho \wedge F_\sigma$. The set of states of S in $F_\tau$
constitutes a subspace of S with information capacity greater than or equal to two from axiom
distinguishability. ■

Definition 14 Given two subspaces $F^r_\rho$ and $F^s_\sigma$, we define their span, denoted
$F^r_\rho \vee F^s_\sigma$, as the set of states in the refinement of a convex combination of $\rho$
and $\sigma$.

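Theorem 1 Given two distinct subspaces of S, $F^r_\rho$ and $F^s_\sigma$, if
$F^m_\eta = F^r_\rho \vee F^s_\sigma$ and $F^t_\tau = F^r_\rho \wedge F^s_\sigma$, with
$\eta = p\sigma + (1 - p)\rho$ and $\tau$ a completely mixed state in $F_\rho \wedge F_\sigma$, then

r + s = t + m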

Proof: In the case both $\rho$ and $\sigma$ are pure states the thesis holds from Lemma 3 and the
definitions of span and of the empty subspace: here r = s = 0, t = -1 and m = 1.

Suppose that only one of them, say $\sigma$, is mixed. Then either $\rho$ is in $F_\sigma$, and then
$\tau = \rho$, or $\rho$ is not in $F_\sigma$ and $F_\tau$ is the empty set. In both these cases the
thesis trivially holds.

Suppose now that both $\rho$ and $\sigma$ are mixed states. By hypothesis and axiom distinguishability
$\tau$ is perfectly distinguishable from some state $\phi_1$ in $F_\rho$ that we choose w.l.o.g. pure.
Let $\tau_1 = p\tau + (1 - p)\phi_1$, $0 < p < 1$. From axiom conservation the information capacity of
$F_{\tau_1}$ is t + 2. Thus we have constructed the subspace $F^{t+1}_{\tau_1}$. The state
$\sigma_1 = p\sigma + (1 - p)\phi_1$, $0 < p < 1$, is such that $F_{\sigma_1}$ has information
capacity less than or equal to s + 2 from axiom conservation. Moreover, from axiom distinguishability,
there exists a pure state $\psi_1 \in F_{\tau_1}$ that is perfectly distinguishable from $\sigma$ and
such that the state $\sigma'_1 = p\sigma + (1 - p)\psi_1$ has refinement $F_{\sigma'_1}$ with
information capacity equal to s + 2. Since by hypothesis $\sigma'_1 \in F_{\sigma_1}$, the information
capacity of $F_{\sigma_1}$ must be s + 2. In this way we have constructed the subspace
$F^{s+1}_{\sigma_1}$.

Now $\tau_1$ is either completely mixed in $F_\rho$ or not. If the latter is the case then, in the
same way we constructed $F^{t+1}_{\tau_1}$ with $F^t_\tau$ and $\phi_1$, we construct
$F^{t+2}_{\tau_2}$ with $F^{t+1}_{\tau_1}$ and a pure state $\phi_2$ in $F_\rho$. In the same way as
before we can also construct $F^{s+2}_{\sigma_2}$ with $F^{s+1}_{\sigma_1}$ and $\phi_2$. Iterating
this procedure we arrive, for some finite k, at a state $\tau_k = p\tau_{k-1} + (1 - p)\phi_k$,
$0 < p < 1$, that is completely mixed in $F_\rho$. This happens when there are no more states in
$F_\rho$ outside $F^{t+k}_{\tau_k}$. k is finite since, if it were not so, we would have infinitely
many perfectly distinguishable states in $F_\rho$, contradicting the hypothesis (r is finite since n
is). Iterating the above construction we also obtain the subspace $F^{s+k}_{\sigma_k}$ where
$\sigma_k = p\sigma_{k-1} + (1 - p)\phi_k$, $0 < p < 1$. By construction we have
$F_{\tau_k} \subseteq F_{\sigma_k}$. Moreover $F^{s+k}_{\sigma_k}$ contains every state in $F_\sigma$
and there cannot exist states in $F_\rho$ outside $F^{s+k}_{\sigma_k}$, since this would contradict
the fact that $F^{t+k}_{\tau_k} = F^r_\rho \subseteq F^{s+k}_{\sigma_k}$. But this in turn means that
every state in $F^m_\eta$ must pertain to $F^{s+k}_{\sigma_k}$ and thus $\sigma_k$ is completely mixed
in $F_\eta$. All this implies that $F^{t+k}_{\tau_k} = F^r_\rho$ and $F^{s+k}_{\sigma_k} = F^m_\eta$,
thus k = r - t and m = s + k. Thus we find m = s + r - t and prove the thesis. ■

Theorem 2 If the states of a system with information capacity n + 1 are described by a probabilistic
theory satisfying axioms 1,2 then pure states of that system are points of a projective space of
dimension n.

Proof: For a system S with information capacity n + 1 satisfying axioms 1,2 there exists a family of
subspaces $F^j$ of dimension j such that:

(1)-(2) (i)-(ii) in Definition 9 hold by Definitions 11, 12.

(3) (iii) in Definition 9 holds by Lemma 1.

(4) (iv) in Definition 9 holds by Lemma 2.

(5) (v) in Definition 9 holds by Lemma 4.

(6) (vi) in Definition 9 holds by Theorem 1. ■
Projective spaces can be very wild geometrical objects.

All spaces in which points are one dimensional subspaces of a vector space defined over some field of
numbers constitute projective spaces, but not all projective spaces can be defined in this way.
Examples of projective spaces that cannot be obtained in this way are the so-called non-Desarguesian
planes |3^, p3[ . Fortunately, the following theorem characterizes projective spaces of dimension
greater than two and, as a consequence, the set of pure states of systems described by theories
obeying axioms 1,2 with information capacity greater than three |3ll-|3S



Theorem 3 Veblen and Young theorem

If the dimension of a projective space is $n \ge 3$ then it is isomorphic to a $\mathbb{K}P^n$, i.e. a
projective space of dimension n over some division ring $\mathbb{K}$.

The proof of this mathematical result is beyond the scope of this paper. A projective space over a
division ring $\mathbb{K}$ is defined as a space whose points are one dimensional subspaces of a
vector space defined over the division ring $\mathbb{K}$. A division ring (also called skew field) is
a mathematical generalization of the concept of field of numbers (e.g. real or complex numbers), in
which multiplication is not required to be commutative. The most popular example of a skew field in
which multiplication is not commutative is the quaternions. In the rest of the paper we will refer to
a skew field of numbers simply as a "field" for brevity.
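The quaternion example can be made concrete with a short sketch, assuming the usual Hamilton product on 4-tuples (w, x, y, z). The helper names `qmul` and `qinv` are hypothetical.

```python
def qmul(p, q):
    """Hamilton product of quaternions given as (w, x, y, z) tuples."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
            a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
            a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
            a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2)

def qinv(q):
    """Inverse of a nonzero quaternion: conjugate divided by squared norm."""
    w, x, y, z = q
    n2 = w * w + x * x + y * y + z * z
    return (w / n2, -x / n2, -y / n2, -z / n2)

ONE, I, J, K = (1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)

ij = qmul(I, J)   # equals K
ji = qmul(J, I)   # equals -K, so multiplication does not commute
q = (1.0, 2.0, 3.0, 4.0)
identity_check = qmul(q, qinv(q))  # approximately ONE
```

Every nonzero element has a multiplicative inverse, which is what makes the quaternions a division ring even though the product depends on the order of the factors.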

Theorem 3 characterizes the state space of systems with information capacity greater than or equal to
four described by a generic probabilistic theory satisfying axioms 1,2. We have thus proved that in a
theory satisfying axioms 1,2, systems with information capacity greater than three either are
classical or are such that the pure states of a system with information capacity n + 1, n > 2, are
points of a $\mathbb{K}P^n$. These can be seen as elements of a vector space defined over a generic
skew field $\mathbb{K}$. In this landscape a set of pure perfectly distinguishable states
$\{\phi_0, \phi_1, \ldots, \phi_n\}$ of a system with information capacity n + 1 is a set of n + 1
disjoint subspaces of the vector space in which pure states of the system are represented. If a system
can be in one of the states $\{\phi_0, \phi_1, \ldots, \phi_n\}$ then it can also be found in a
superposition of these states, $\phi_s = \sum_{i=0}^{n} k_i \phi_i$ with
$\{k_i\} \subset \mathbb{K}$, since this linear combination represents an element of the vector space
and thus an allowed pure state. Hence in a probabilistic theory satisfying axioms 1,2, systems with
information capacity greater than three are either classical (since classical theory satisfies them)
or a generalization of quantum systems in which the superposition principle holds with coefficients
that are not necessarily complex but belong to a generic (skew) field of numbers $\mathbb{K}$. In
Section III C it will be proved that this characterization also holds for systems with information
capacity three or two.



B. State space of an elementary system 

An elementary system is a system with information capacity two. We will denote the set of states of an
elementary system $S_2$. We now show that all theories satisfying axioms 1-3 are such that the set of
pure states in the refinement of a state with information capacity two constitutes a sphere embedded
in Euclidean space of some dimension $d_2$. If $\rho$ is a completely mixed state in $S_2$ this result
implies that the number of entries of a real vector representing a normalized state of an elementary
system is $d_2 + 1$, where the normalization degree of freedom is explicitly taken into account. In
quantum theory $d_2 = 3$.

Lemma 5 Given a state $\rho$, if $F_\rho$ has information capacity two, then the set of pure states
in $F_\rho$ is a sphere.



Proof: From axiom reversibility, for any two pure states $\psi$, $\phi$ in $F_\rho$ there exists a
transformation G such that $\phi = G\psi$. The set of transformations mapping pure states in $F_\rho$
into pure states in $F_\rho$ forms a group. This group must be compact since vectors representing pure
states in $F_\rho$ constitute a real representation of it and have bounded entries. From this fact and
the fact that any compact group admits a representation by means of orthogonal $d \times d$ matrices,
we know that we can represent pure states in $F_\rho$ as points of a d-dimensional sphere. We now
prove that every point of the sphere represents a pure state. Suppose it is not so. Then there exists
a mixed state $\sigma$ on the border of the convex set $F_\rho$ that is not completely mixed. From
axiom distinguishability, $\sigma$ must be perfectly distinguishable from some other pure state
$\phi$ not in $F_\sigma$. $F_\sigma$ contains at least two perfectly distinguishable states from axiom
distinguishability, since $\sigma$ is by hypothesis mixed and there must be a pure state in
$F_\sigma$ that is perfectly distinguishable from some other state in $F_\sigma$. The state
$\omega = p\sigma + (1 - p)\phi$, $0 < p < 1$, is in $F_\rho$, thus by hypothesis $F_\omega$ cannot
have information capacity larger than two. However, from axiom conservation, $F_\omega$ has
information capacity equal to one plus the information capacity of $F_\sigma$, and thus greater than
two. Hence we find a contradiction, proving that the set of pure states in $F_\rho$ constitutes a
sphere. ■

The above lemma allows us to conclude that an elementary system satisfying axioms 1-3 is either
classical or is a generalized Bloch sphere in dimension $d_2$. In the following lemma we make explicit
the Bloch representation for the generalized elementary system.

Lemma 6 A point $\psi$ of the sphere constituting the state space of an elementary system has the
following form:

$$\psi = \begin{pmatrix} 2p(x_1) - 1 \\ \vdots \\ 2p(x_{d_2}) - 1 \\ 1 \end{pmatrix}$$

where $\{x_i\}_{i=1}^{d_2}$ is a set of fiducial effects for $S_2$.

Proof: In the representation where $\psi$ is a vector on the unit sphere in $\mathbb{R}^{d_2}$, the
probability of seeing an effect $\phi$ in a measurement, given that the prepared state is
$\psi \in S_2$, is:

$$E_\phi(\psi) = \frac{1}{2}\left(1 + \phi^T \psi\right) \qquad (9)$$

with $\phi$ representing some other unit vector on the sphere.
The orthonormal basis for $\mathbb{R}^{d_2}$ is:

$$x_1 = (1, 0, \ldots, 0)^T, \quad x_2 = (0, 1, \ldots, 0)^T, \quad \ldots, \quad x_{d_2} = (0, 0, \ldots, 1)^T \qquad (10)$$

It represents a set of fiducial effects for $S_2$ since all effects can be represented as unit vectors
on the sphere. For any state $\phi \in S_2$ we have that the probability for the i-th fiducial effect
is $p(x_i) = E_i(\phi) = (1 + \phi^i)/2$, hence $\phi^i = 2p(x_i) - 1$. This proves the thesis. ■
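The relation between fiducial probabilities and state entries can be illustrated with a minimal sketch, assuming $d_2 = 3$ and an arbitrary unit vector. The angles and function names below are hypothetical choices for the example.

```python
import math

def effect_probability(phi, psi):
    """Equation (9): probability of effect phi on state psi (unit vectors)."""
    return 0.5 * (1.0 + sum(a * b for a, b in zip(phi, psi)))

def fiducial_probabilities(psi):
    """p(x_i) for the orthonormal basis effects x_i of (10)."""
    d = len(psi)
    basis = [[1.0 if k == i else 0.0 for k in range(d)] for i in range(d)]
    return [effect_probability(x, psi) for x in basis]

def state_from_probabilities(probs):
    """Lemma 6: entries 2 p(x_i) - 1, followed by the normalization entry 1."""
    return [2.0 * p - 1.0 for p in probs] + [1.0]

theta, azimuth = 1.1, 0.4  # arbitrary point on the sphere
psi = [math.sin(theta) * math.cos(azimuth),
       math.sin(theta) * math.sin(azimuth),
       math.cos(theta)]
probs = fiducial_probabilities(psi)
reconstructed = state_from_probabilities(probs)
```

The first $d_2$ entries of `reconstructed` reproduce `psi`, and a state gives probability one on the effect represented by the same unit vector and probability zero on the antipodal one.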



C. Quantum theory 

At this point of our derivation we have not yet considered composite systems. In this section we are
going to show that the only non classical theory satisfying axioms 1-3 and axiom composition is
quantum theory.

First we will use axiom composition to prove what in pO| , |2^ , p3| is called "axiom subspace" and
what in ||2^ is derived from "axiom compression" (a slightly different restatement of axiom subspace).
In Section IV A the significance of that axiom will be discussed, and it will be pointed out that it
may be regarded as a mathematical requirement on the state space of a physical system rather than a
natural informational or operational constraint.

The strategy to prove "axiom subspace" is to show that the field of numbers $\mathbb{K}$ over which
the projective space of a physical system satisfying axioms 1-4 is defined does not depend on the
system considered but is a property of the theory. This will imply that the object describing a system
with information capacity m + 1, a $\mathbb{K}P^m$, is the same object describing a subspace with
information capacity m + 1 of a larger system described by a $\mathbb{K}P^n$ with n > m. From this
fact, any representation of the state space of the system with information capacity m + 1 can be
equivalently considered a representation of a subspace with information capacity m + 1 of a larger
system, and this will prove axiom subspace in our derivation.

We will now consider two projective spaces representing two physical systems of a theory satisfying
axioms 1-4 and the composite system obtained by composing them. Regarding either of the subsystems as
a subspace of the composite system, we will show that all projective spaces describing systems in a
theory satisfying axioms 1-4 must be defined over the same field of numbers.

Definition 15 Two projective spaces $P_1$ and $P_2$ are isomorphic iff there exists a bijective map
between $P_1$ and $P_2$ such that the points in a subspace of dimension n of $P_1$ are mapped into
points of a subspace of dimension n of $P_2$ and conversely.

In the following lemma we will consider two different projective spaces $\mathbb{K}P^n$ and
$\mathbb{L}P^m$ describing two systems of information capacity n + 1 and m + 1 respectively, and the
projective space $\mathbb{J}P^l$ describing the system obtained by composing the two systems, which
must have information capacity $l + 1 = (m + 1)(n + 1)$. We will then construct an isomorphism between
a subspace of dimension n (m) of $\mathbb{J}P^l$ and $\mathbb{K}P^n$ ($\mathbb{L}P^m$).

Lemma 7 For any probabilistic theory satisfying axioms 1-4, given two systems of information capacity
m + 1 and n + 1, and the composite system obtained by composing them, there exists an isomorphism
between any of the subsystems and a subspace of the composite system with the same information
capacity.

Proof: Suppose we have two systems K and L satisfying axioms 1-4, of information capacity n + 1 and
m + 1 respectively. According to the results of Section III A their state spaces constitute two
projective spaces of dimension n and m respectively, defined over a skew field. Let K be represented
by $\mathbb{K}P^n$ and L by $\mathbb{L}P^m$. Composing the two systems we obtain, from axioms 1-4,
another projective space that can be represented in a real Euclidean space of dimension
$(d_n + 1)(d_m + 1) - 1$ (excluding the normalization degree of freedom) and that we denote as
$J = \mathbb{J}P^l$ with $l + 1 = (m + 1)(n + 1)$. Consider any state $\rho \in K$ and a fixed pure
state $\phi \in L$. The map obtained as:

$$\rho \mapsto \rho \otimes \phi \qquad (11)$$

is a map between states in K and states in J. We now show that this map is an isomorphism between K
and a subspace of dimension n of J, showing that $\mathbb{K}P^n$ is isomorphic to a subspace of
dimension n of $\mathbb{J}P^l$. To see this let $\rho'$ be a completely mixed state in K.
$F_{\rho' \otimes \phi}$ is, by definition, a subspace of dimension n of J. The map is injective from
K to $F_{\rho' \otimes \phi}$ and, since $\phi$ is a fixed pure state, also surjective. This means
that we have a bijective map between pure states in K and pure states in $F_{\rho' \otimes \phi}$.
Moreover, mapping any state in a subspace of dimension h < n of K results in a state in a subspace of
dimension h of $F_{\rho' \otimes \phi}$. Thus we have an isomorphism between a subspace of J with
dimension n, $\mathbb{J}P^n$, and the $\mathbb{K}P^n$ describing K. Reversing the roles of L and K in
the above argument we have the thesis. ■

Theorem 4 The field of numbers on which are defined 
projective spaces representing pure states of systems in a 
theory satisfying axioms 1-4 is a property of the theory 
and does not depend on the system considered. 

Proof: Composing a system K described by $\mathbb{K}P^n$ with a system L described by
$\mathbb{L}P^m$ results in a composite system J described by $\mathbb{J}P^l$ with
$l + 1 = (m + 1)(n + 1)$. From Lemma 7, there exists an isomorphism between $\mathbb{L}P^m$ and a
subspace of dimension m of J, $\mathbb{J}P^m$, and also an isomorphism between $\mathbb{K}P^n$ and a
subspace of dimension n of J, $\mathbb{J}P^n$. From standard projective geometry, [ pO[ , this is
possible only if $\mathbb{J} = \mathbb{K} = \mathbb{L}$. Since this holds for systems of arbitrary
finite information capacity we have the thesis. ■

We now turn to the last part of our derivation, where we will show that in a theory satisfying axioms
1-4 the field of numbers is $\mathbb{K} = \mathbb{C}$.

From axiom composition we know that the system composed of two elementary systems has information
capacity four and we denote the set of states of the composed system $S_4$. We also know that if
$d_2 + 1$ is the number of parameters required to specify a state of an elementary system,
$(d_2 + 1)^2$ parameters will suffice to specify a state of the system composed of two elementary
systems.
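The parameter count can be spelled out with a small sketch, assuming the decomposition into two local vectors, a correlation matrix and one normalization entry that is used for the composite state vector later in this section. The function name is hypothetical.

```python
def composite_parameter_count(d2):
    """Split the (d2 + 1)**2 parameters of a state of two elementary systems."""
    marginal_a = d2          # local vector of the first elementary system
    marginal_b = d2          # local vector of the second elementary system
    correlations = d2 * d2   # entries of the correlation matrix
    normalization = 1
    total = marginal_a + marginal_b + correlations + normalization
    assert total == (d2 + 1) ** 2
    return total

# d2 = 3 is the quantum value: 16 real parameters, the same number needed
# for a 4 x 4 Hermitian matrix (4 real diagonal entries plus 6 complex
# off-diagonal entries).
quantum_total = composite_parameter_count(3)
```

For $d_2 = 3$ this gives 16, matching the number of real parameters of an unnormalized two-qubit density matrix.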

It has been shown that if a theory satisfies axiom composition then the dimension of an elementary
system must be odd. Since the group of transformations of an elementary system must be transitive on a
sphere in $\mathbb{R}^{d_2}$, it must coincide with one of the Lie groups whose action is transitive
on a sphere in $\mathbb{R}^{d_2}$ with odd $d_2$. The possibilities are: $SO(d_2)$; the smallest
exceptional Lie group, usually denoted as $G_2$, whose fundamental representation is a subgroup of
$SO(7)$. This observation has already been made in the literature. We now rule out the last
possibility in that list with the following:

Lemma 8 $G_2$ cannot be the group of transformations of an elementary system in a theory satisfying
axioms 1-4.



Proof: $G_2$ is the group of automorphisms of the algebra of octonions. If this were the group of
transformations of an elementary system, then the state space of such a system would be a projective
space of dimension 1 over the octonions. Consider a system composed of two elementary systems of this
kind. The state space of the composite system must form a projective space of dimension 3 from axiom
composition and axioms 1,2. Considering one of the two subsystems as a subspace of the composite
system, the 3-dimensional projective space should contain a one dimensional subspace which is an
octonionic one dimensional projective space. But this would mean that the 3-dimensional projective
space we are dealing with is an octonionic projective space, and this contradicts Theorem 3 since the
octonions are not a skew field (they lack associativity of multiplication). Hence we reach a
contradiction and we prove the thesis. ■

The following proposition will be used in the proof of the subsequent lemma.

Proposition 2 A separable state $\psi$ is pure iff the marginal states on both subsystems are pure.

Proof: Let $\phi_1$ and $\phi_2$ be the marginal states of $\psi$ on system A and B respectively.
Suppose $\phi_1$, $\phi_2$ pure and $\psi$ mixed. Then
$\psi = p_a \phi'_1 \otimes \phi_2 + p_b \phi''_1 \otimes \phi_2$ and the marginal state on one of the
subsystems would be mixed. This proves one implication. Conversely, suppose that $\psi$ is pure and
that $\phi_1$ is mixed. Then we would have
$\psi = (p_A \phi'_1 + p_B \phi''_1) \otimes \phi_2 = p_A \phi'_1 \otimes \phi_2 + p_B \phi''_1 \otimes \phi_2$
and $\psi$ would be mixed. This proves the thesis. ■

From axiom reversibility we know that the set of transformations of $S_4$ forms a compact group and
any such group can be represented by means of orthogonal matrices. In the following lemma we will find
such a representation. The argument used in the proof of the following lemma was introduced in p^ .

Lemma 9 The representation of pure states of $S_4$ in which the group of transformations is
represented by means of orthogonal matrices is such that a pure normalized state $\psi$ is represented
as follows:

$$\psi = \begin{pmatrix} \alpha_i = 2p(x_i) - 1 \\ \beta_j = 2p(y_j) - 1 \\ \gamma_{ij} = 4p(x_i, y_j) - 2p(x_i) - 2p(y_j) + 1 \\ 1 \end{pmatrix} \qquad (12)$$

where $\{x_i\}_{i=1}^{d_2}$, $\{y_j\}_{j=1}^{d_2}$ are sets of fiducial effects for $S_2$,
$\{\alpha_i\}$, $\{\beta_j\}$ represent the marginal states of $\psi$, $\{\gamma_{ij}\}$ represents
the correlation matrix and the last entry represents normalization.



Proof: A normalized state $\psi \in S_4$ by axiom Local Tomography can be represented as in (12). By
axiom Reversibility we know that every pure state in $S_4$ can be expressed as
$A\,\phi_1 \otimes \phi_2$, with A a matrix (not necessarily orthogonal) representing an element in
the group of reversible transformations of $S_4$ and $\phi_1 \otimes \phi_2$ a pure state represented
as in (12). In the case A is product we already know that A is an orthogonal matrix by Lemma 5. Since
the group of reversible transformations in $S_4$ is compact we know that there exists a matrix S such
that $SAS^{-1} = O$ with O orthogonal and
$S \in \mathbb{R}^{(d_2^2 + 2d_2)} \times \mathbb{R}^{(d_2^2 + 2d_2)}$ for every reversible
transformation in $S_4$ (not only the product ones). The matrix S is non singular since it must
perform a change of basis from the representation (12) to the representation in which A is an
orthogonal matrix. We will show that S is proportional to the identity, namely, those two
representations are the same. If A is product then it is orthogonal; from this we have that if A is
product then $A = O_1 \otimes O_2$ and SA = AS. Product transformations form a subgroup of the group
of transformations of $S_4$. The following three subspaces: i) the real span of the vectors
$(\alpha_1, \ldots, \alpha_{d_2})^T$; ii) the real span of the vectors
$(\beta_1, \ldots, \beta_{d_2})^T$; iii) the real span of the matrices
$\{\gamma_{ij}\}_{i,j=1}^{d_2}$, are three invariant subspaces for the subgroup of product
transformations. By Schur's lemma we thus have:

$$S = \begin{pmatrix} a I_{d_2} & 0 & 0 \\ 0 & b I_{d_2} & 0 \\ 0 & 0 & s I_{d_2^2} \end{pmatrix} \qquad (13)$$

for some a, b, s > 0, where $I_d$ is the identity $d \times d$ matrix. Now define $\phi^0 = \theta$
and $\phi^1 = -\theta$ with $\theta \in \mathbb{R}^{d_2}$ and $\phi^{0,1}$ pure perfectly
distinguishable states in $S_2$. Since the product of two pure states is pure, from Proposition 2 we
have that $\phi^a \otimes \phi^b$ is pure $\forall a, b \in \{0, 1\}$. By axiom Reversibility there
exist transformations $G_{sw}$ and $G_{cnot}$ such that
$G_{sw}\,\phi^a \otimes \phi^b = \phi^b \otimes \phi^a$ and
$G_{cnot}\,\phi^a \otimes \phi^b = \phi^a \otimes \phi^{a \oplus b}$, where $\oplus$ denotes sum
modulo 2. This implies that $G_{sw}(\theta, 0, 0) = (0, \theta, 0)$ while
$G_{cnot}(0, 0, \theta\theta^T) = (0, \theta, 0)$, where the first and second entries in these vectors
represent the vectors $\alpha$ and $\beta$, i.e. the marginal states of the two component systems,
while the third entry represents the matrix $\gamma$ containing the information regarding correlations
of the component systems (see (12)). Rewriting these two expressions in the representation where
transformations are orthogonal matrices we have:

$$S G_{sw} S^{-1}(a\theta, 0, 0) = (0, b\theta, 0)$$

and

$$S G_{cnot} S^{-1}(0, 0, s\theta\theta^T) = (0, b\theta, 0)$$

Since pure states of $S_2$ are unit vectors (Lemma 6) we have $\|\theta\|^2 = 1$ and
$\|\gamma_{\phi^0 \otimes \phi^0}\|^2 = \mathrm{Tr}[\gamma^T_{\phi^0 \otimes \phi^0}\gamma_{\phi^0 \otimes \phi^0}] = 1$,
where $\gamma_{\phi^0 \otimes \phi^0} = \theta\theta^T$. Since $SGS^{-1}$ is orthogonal for all G we
have that $(a\theta, 0, 0)$, $(0, 0, s\theta\theta^T)$ and $(0, b\theta, 0)$ have the same modulus,
hence a = s = b. This implies that S = aI and proves the thesis. ■

Lemma 10 For every pure state $\psi \in S_4$ there exists an effect that gives probability one on it.

Proof: The thesis holds for pure product states of $S_4$ from the results obtained in subsection III B
and the fact that probabilities for product states and product effects factorize. Consider performing
a transformation O on the composed system in a product state $\psi_{\mathrm{prod}}$, then performing
its inverse $O^{-1}$ and, after that, making a measurement containing the effect giving probability 1
on $\psi_{\mathrm{prod}}$. O acts on the state $\psi_{\mathrm{prod}}$ while $O^{-1}$ acts before the
effect giving probability 1 on $\psi_{\mathrm{prod}}$, thus transforming this effect. Making this
explicit in the notation introduced above, we have:

$$(\psi_{\mathrm{prod}}|O^{-1}O\,\psi_{\mathrm{prod}}) = (\psi_{\mathrm{prod}}O^{-1}|O\,\psi_{\mathrm{prod}}) = 1 \qquad (14)$$

where $(\psi_{\mathrm{prod}}|$ represents the effect giving probability 1 on the state
$\psi_{\mathrm{prod}}$. By axiom reversibility, every state in $S_4$ can be written as
$O\psi_{\mathrm{prod}}$ for some reversible transformation matrix O. This implies the thesis. ■

Definition 16 Given a state $\psi \in S_4$, the effect giving probability one on $\psi$ involved in
Lemma 10 will be called the effect corresponding to $\psi$.

Lemma 11 Any state vector $\psi \in S_4$ in the representation of Lemma 9 is such that
$\psi^T\psi = 4$.

Proof: The thesis holds by Lemma 6 for pure product states. By axiom Reversibility and the fact that
in the representation of Lemma 9 transformations are represented as orthogonal matrices, the thesis
follows also for every $\psi \in S_4$. ■
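For product states the entries of (12) and the value of the squared norm can be checked with a short sketch, assuming $d_2 = 3$, fiducial probabilities of the form $(1 + a_i)/2$ and factorized joint probabilities. The function names and the two sample unit vectors are hypothetical.

```python
import math

def bloch(theta, azimuth):
    """Unit vector in three dimensions from two angles."""
    return [math.sin(theta) * math.cos(azimuth),
            math.sin(theta) * math.sin(azimuth),
            math.cos(theta)]

def product_state_vector(a, b):
    """Entries of (12) for a product of pure states with unit vectors a, b."""
    d = len(a)
    p_x = [(1 + a[i]) / 2 for i in range(d)]   # p(x_i)
    p_y = [(1 + b[j]) / 2 for j in range(d)]   # p(y_j)
    # Joint probabilities factorize for product states and product effects.
    p_xy = [[p_x[i] * p_y[j] for j in range(d)] for i in range(d)]
    alpha = [2 * p - 1 for p in p_x]
    beta = [2 * p - 1 for p in p_y]
    gamma = [[4 * p_xy[i][j] - 2 * p_x[i] - 2 * p_y[j] + 1 for j in range(d)]
             for i in range(d)]
    return alpha, beta, gamma

def squared_norm(alpha, beta, gamma):
    """Sum of squares of all entries of (12), including the last entry 1."""
    return (sum(x * x for x in alpha) + sum(x * x for x in beta)
            + sum(x * x for row in gamma for x in row) + 1)

alpha, beta, gamma = product_state_vector(bloch(0.7, 1.9), bloch(2.3, -0.5))
norm2 = squared_norm(alpha, beta, gamma)
```

For a product state the correlation matrix reduces to $\gamma_{ij} = \alpha_i \beta_j$, so the squared norm is $(1 + \|\alpha\|^2)(1 + \|\beta\|^2) = 4$.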

Lemma 12 In the representation of Lemma 9, given any state $\psi$, the effect corresponding to $\psi$
is represented by a vector proportional to that representing $\psi$.

Proof: For product states the thesis holds with the corresponding effect being a product effect. This
comes from Lemma 6 and the fact that probabilities for product states and product effects factorize.
In the representation of Lemma 9, since transformations are represented as orthogonal matrices, we
have $(\psi_{\mathrm{prod}}O^{-1}| = \frac{1}{4}\psi_{\mathrm{prod}}^T O^{-1} = \frac{1}{4}(O\psi_{\mathrm{prod}})^T$
(the factor 1/4 comes from normalization) while $|O\psi_{\mathrm{prod}}) = O\psi_{\mathrm{prod}}$, and
the probability is simply the scalar product of these two vectors. ■

We already know that a two dimensional subspace of the state space of $S_4$ is a representation of the
state space of an elementary system. In the following lemma we will show that the set of effects
corresponding to the pure states in a two dimensional subspace of $S_4$ is the set of effects of an
elementary system. We will thus prove the following:

Lemma 13 The set of effects corresponding (in the sense of Definition 16) to the pure states in a two
dimensional subspace of $S_4$ forms the same manifold as the effects associated to an elementary
system.

Proof: From Lemma 5 any two dimensional subspace $F_\rho$ of $S_4$ constitutes a sphere. By the
results above this is the same manifold formed by the state space of an elementary system $S_2$,
namely, a $d_2$-dimensional sphere with transitivity group $SO(d_2)$ with $d_2$ odd. This is the case
since both $F_\rho$ and $S_2$ are a projective space of dimension one over a given field of numbers
$\mathbb{K}$. Let T be the subset of the set of transformations of $S_4$ that transforms pure states
in $F_\rho$ into pure states in $F_\rho$. Since the manifold representing states in $F_\rho$ is the
same as that representing states in $S_2$, the action of T on these state vectors represents the group
of transformations of an elementary system. Since T represents such a group, the effects corresponding
to the pure states in $F_\rho$, by Definition 16, are also represented by a $d_2$-dimensional sphere.
This implies that states in $F_\rho$ and the effects corresponding to them are an equivalent
representation of the states and effects of an elementary system. ■

In the following lemmas we will use the following notation.

• $(e|$ represents the deterministic effect.

• ${}_{S_2}(e|\psi)_{S_4}$ indicates the marginal state of $\psi$, namely $\alpha_\psi$ or
$\beta_\psi$.

• ${}_{S_2}(a|\psi)_{S_4}$ indicates the unnormalized state of $S_2$ obtained by measuring an effect
$(a|$ on one component system of the two-bit system in state $\psi \in S_4$.

• ${}_{S_2}(b| \otimes {}_{S_2}(a|\psi)_{S_4} = (ab|\psi)$, where $(a|$, $(b|$ are two effects in
$S_2$.

Lemma 14 Given $(0|$, $(1|$ two perfectly discriminating effects in $S_2$, the deterministic effect
for the system composed of two bits is:

$${}_{S_4}(e| = (00| + (11| + (10| + (01| \qquad (15)$$

where $(ij| = (i| \otimes (j|$, $i, j \in \{0, 1\}$.

Proof: The effect in (15) represents the coarse graining of the outcomes obtained in a measurement for
the composite system. ■

Lemma 15 Given $(0|$, $(1|$ two perfectly discriminating effects in $S_2$, there exists a state
$\psi_{\mathrm{ent}} \in S_4$ such that:

$$(00|\psi_{\mathrm{ent}}) = (11|\psi_{\mathrm{ent}}) = 1/2 \qquad (16)$$

where $(ii| = (i| \otimes (i|$, $i = 0, 1$.

Proof: If $(0|$, $(1|$ are two perfectly discriminating effects there will be two pure perfectly
distinguishable states $|0), |1) \in S_2$ that are perfectly discriminated by them. Let
$\rho = \frac{1}{2}(|00) + |11))$ be a state of $S_4$. $F_\rho$ is a two dimensional subspace of
$S_4$. From Lemma 13 the set of states in $F_\rho$ and the set of effects corresponding to them are
represented by the same manifolds as the corresponding sets of an elementary system. If we represent
these two sets in real Euclidean space, then each is the dual of the other, as in the case of an
elementary system. There must then exist a state $\psi_{\mathrm{ent}} \in F_\rho$ with the claimed
property. This is represented by a state on the equator of the sphere representing pure states in
$F_\rho$. ■

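Lemma 16 Using the representation of Lemma 9, for $\psi_{\mathrm{ent}}$ we have
$\alpha_{\psi_{\mathrm{ent}}} = \beta_{\psi_{\mathrm{ent}}} = 0$ and

$$\gamma_{\psi_{\mathrm{ent}}} = \begin{pmatrix} 1 & 0 \\ 0 & C \end{pmatrix} \qquad (17)$$

where C is a $(d_2 - 1)$-dimensional real matrix.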

Proof: a^^^j and are representations of the two 

marginal states (6|V'cnt)s4 on the two component sys- 
tems of 54. |0), |1) are the two perfectly distinguishable 
states of lemma ^ In the representation of lemma 
these are represented by two antipodal vectors </)°,0^ on 
the unit sphere in . (0|, (1| are the two corresponding 
discriminating effects. We know that: 

52(e|V'cnt)54 = 52(O|V'ont)54+52(l|'0ont)54 ^ P\x) + {'^-p)\x' ) 

(18) 

where $p|x)$, $(1 - p)|x')$ are two unnormalized states in
$S_2$. The state $\rho = \frac{1}{2}(|0)|0) + |1)|1))$ is perfectly
distinguishable from $|1)|0)$ with the measurement
$\{(0|(0| + (1|(1|,\ (1|(0| + (0|(1|\}$ where $(0|(0| + (1|(1|$ represents the
coarse graining of the corresponding outcomes. The state
$\psi_{ent} \in F_\rho$ is perfectly distinguishable from $|1)|0)$ with the
same measurement. This implies that $(1|(0|\psi_{ent}) = 0$.
This in turn means that $p(1|x) = 0$, which is true iff
$(1|x) = 0$. $|x)$ is a pure normalized state in $S_2$ that gives
probability 0 on the effect $(1|$ and the unique normalized
state in $S_2$ with this property is $|0)$. This implies
$|x) = |0)$. With the same argument we can conclude
that $(0|x') = 0$ and thus $|x') = |1)$. Since by hypothesis
$(0|(0|\psi_{ent}) = (1|(1|\psi_{ent}) = 1/2$ we must have $p = 1/2$.
Hence we have ${}_{S_2}(e|\psi_{ent})_{S_4} = \frac{1}{2}(|0) + |1))$. This implies
that $\alpha_{\psi_{ent}} = 0$ since $|0)$ and $|1)$ are antipodal vectors
in the representation we are working with. The same argument
applied to the other component gives $\beta_{\psi_{ent}} = 0$. Without
loss of generality we can choose the vector representing
$|0)$ equal to $x_1$ where $x_1$ is defined in (10) and $|1)$ is
the antipode of $x_1$. This is due to the arbitrariness of the
choice of $|0), |1)$ in lemma 15. Since from this lemma $\psi_{ent}$
is such that:



$$(00|\psi_{ent}) + (11|\psi_{ent}) = 1 \qquad (19)$$



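we conclude that $\gamma^{11}_{\psi_{ent}} = 1$ where $\gamma^{ij}_{\psi_{ent}}$ is the $ij$-th entry
of the matrix $\gamma_{\psi_{ent}}$. From the fact that $\alpha_{\psi_{ent}} = \beta_{\psi_{ent}} = 0$
the action of an element of the group of product transformations
on $\psi_{ent}$ is:

$$O_1 \otimes O_2\, \psi_{ent} = O_1 \gamma_{\psi_{ent}} O_2^T \qquad (20)$$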



with $O_i$ an orthogonal matrix in $SO(d_2)$ and $\gamma_{\psi_{ent}}$ a $d_2$
dimensional real matrix. Suppose now that the vector
$\{\gamma^{i1}_{\psi_{ent}}\}_{i=1}^{d_2}$ is such that there exists $\gamma^{i1}_{\psi_{ent}} \neq 0$ for some $i \neq 1$.
Then, since the group of transformations of $S_2$ is transitive
on the sphere, there would exist an $O_1$ such that the
vector $O_1 \{\gamma^{i1}_{\psi_{ent}}\}_{i=1}^{d_2}$ would have an entry with modulus
greater than 1. Such an entry would pertain to a state
vector representing a pure state in $S_4$. The representation
we are working with is by hypothesis the one found
in lemma ^; it is clear from ( p^ ) that there cannot exist
entries of state vectors with modulus greater than one in
this representation. This implies that $\gamma^{i1}_{\psi_{ent}} = 0$ for all
$i \neq 1$. Repeating the argument for the vector $\{\gamma^{1i}_{\psi_{ent}}\}_{i=1}^{d_2}$
and a transformation $O_2$ we also find $\gamma^{1i}_{\psi_{ent}} = 0$ for all
$i \neq 1$. ■
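The norm argument used in this proof can be checked numerically. The following is a minimal sketch and not part of the original derivation: it assumes $d_2 = 3$ purely for illustration, the column values are made up, and the helper names (`givens_rotation`, `apply`, `align_with_first_axis`) are hypothetical. It takes a column whose first entry is 1 and which has one further nonzero entry, and rotates it with an element of $SO(3)$ so that the first entry becomes the Euclidean norm of the column, which exceeds 1.

```python
import math

def givens_rotation(d, k, theta):
    """Rotation by angle theta in the plane of axes 0 and k; an element of SO(d)."""
    O = [[1.0 if r == c else 0.0 for c in range(d)] for r in range(d)]
    O[0][0], O[0][k] = math.cos(theta), math.sin(theta)
    O[k][0], O[k][k] = -math.sin(theta), math.cos(theta)
    return O

def apply(O, v):
    """Matrix-vector product O v."""
    return [sum(O[r][c] * v[c] for c in range(len(v))) for r in range(len(O))]

def align_with_first_axis(column, k):
    """Rotate `column` so that the weight of entry k is moved onto the first axis."""
    theta = math.atan2(column[k], column[0])
    return apply(givens_rotation(len(column), k, theta), column)

# Hypothetical first column of gamma: gamma^{11} = 1 and gamma^{21} = 0.3 != 0.
column = [1.0, 0.3, 0.0]
rotated = align_with_first_axis(column, 1)

# The first entry is now the norm of the column, hence larger than 1.
print(rotated[0])
```

Since a rotation preserves the norm, any column with first entry 1 and another nonzero entry can be brought to a form with an entry larger than 1, which is the contradiction the proof relies on.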

We now have to show that the dimensionality of the
state space of an elementary system in a theory satisfying
axioms 1-4 is three. In order to do so we will use a
strategy inspired by that invented in p^ .

We introduce the following state:

$$\psi^i_{ent} = (R^i \otimes I)\,\psi_{ent} \qquad (21)$$

$I$ is the identity in $S_2$ while $R^i$ is the matrix having entries:
$R^i_{kl} = 0$ if $k \neq l$, $R^i_{kk} = 1$ if $k \neq 1, i$, $R^i_{kk} = -1$
if $k = 1, i$. We know that $\psi^i_{ent}$ is a state in $S_4$ for all $i$
since, for all $i$, $R^i$ is in $SO(d_2)$, which coincides with the
group of reversible transformations in $S_2$.
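The properties of $R^i$ used in what follows can be verified directly. This is a minimal sketch under stated assumptions: the helper names (`r_matrix`, `matmul`, `identity`, `diagonal_determinant`) are hypothetical, indices are 1-based to match the text, and $d_2 = 3$ is chosen only for illustration. It checks that $R^i$ has determinant $+1$, that it is its own inverse, and that it maps the first basis vector (the one called $x_1$ in the text) to its antipode.

```python
def r_matrix(d, i):
    """Diagonal matrix R^i for 2 <= i <= d (1-based): -1 at positions 1 and i, +1 elsewhere."""
    if not 2 <= i <= d:
        raise ValueError("i must satisfy 2 <= i <= d")
    return [[(-1.0 if k in (1, i) else 1.0) if k == l else 0.0
             for l in range(1, d + 1)]
            for k in range(1, d + 1)]

def identity(d):
    """d x d identity matrix."""
    return [[1.0 if r == c else 0.0 for c in range(d)] for r in range(d)]

def matmul(A, B):
    """Product of two square matrices of the same size."""
    n = len(A)
    return [[sum(A[r][m] * B[m][c] for m in range(n)) for c in range(n)]
            for r in range(n)]

def diagonal_determinant(D):
    """Determinant of a diagonal matrix: the product of its diagonal entries."""
    det = 1.0
    for k in range(len(D)):
        det *= D[k][k]
    return det

d = 3  # illustrative value only
for i in range(2, d + 1):
    R = r_matrix(d, i)
    assert diagonal_determinant(R) == 1.0   # two sign flips, so det = +1
    assert matmul(R, R) == identity(d)      # R^i is an involution
    assert R[0][0] == -1.0                  # the first axis is sent to its antipode
```

Because exactly two diagonal entries are $-1$, the determinant is $+1$ for every $i$, which is why $R^i$ is a rotation rather than a reflection.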

Lemma 17 $\psi_{ent}$ and $\psi^i_{ent}$ are such that $(\psi_{ent}|\psi^i_{ent}) = 0$.

Proof: From lemma ^ we know that:

$$(00| + (11| + (01| + (10| = (e| \qquad (22)$$

where $(0| = x_1$, $x_1$ is defined in (10) and $(1|$ is represented
by the vector antipodal to $(0|$ in the $d_2$-dimensional
sphere representing effects of $S_2$. This choice is the same
as the one made in lemma 16. From lemma 13 we also have that:



$$(e|\psi_{ent}) = (00 + 11|\psi_{ent}) = 1$$

which implies $(10|\psi_{ent}) = (01|\psi_{ent}) = 0$. Now also note
that, $R^i$ being its own inverse:

$$(00 + 11|(R^i \otimes I)(R^i \otimes I)\psi_{ent}) = (00 + 11|\psi_{ent}) = 1$$

and since:

$$(00 + 11|(R^i \otimes I)(R^i \otimes I)\psi_{ent}) = (10 + 01|\psi^i_{ent}) = 1$$

it follows that $(00|\psi^i_{ent}) = (11|\psi^i_{ent}) = 0$. This implies
that the test $\{(00| + (11|,\ (01| + (10|\}$ is perfectly
discriminating for $\psi_{ent}$, $\psi^i_{ent}$, hence the two states are
perfectly distinguishable. From lemma ^ the set of states
and the set of effects of $F_\rho$ have the same representation
as geometrical objects in real euclidean space as the
corresponding sets of an elementary system. This means
that $\psi^i_{ent}$, $\psi_{ent}$ are two antipodal points of the sphere, thus

$(\psi^i_{ent}|\psi_{ent}) = 0$. ■

Theorem 5 The dimension of the state space of an ele- 
mentary system is d2 = 3. 

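Proof: Let $\rho = \frac{1}{2}(\psi_{ent} + \psi^i_{ent})$. $F_\rho$ has information
capacity two. From lemma |lj and lemma ^ we know
that

$$(\psi_{ent}|\psi^i_{ent}) = 0 \qquad (23)$$

We will now make (23) explicit using the representation of
lemma |^. First we change notation for the matrix $\gamma_{\psi_{ent}}$:

$$\gamma_{\psi_{ent}} = (\gamma_{\psi_{ent}1}, \gamma_{\psi_{ent}2}, \ldots, \gamma_{\psi_{ent}d_2}) \qquad (24)$$

where $\gamma_{\psi_{ent}i}$ is the $i$-th column vector of the matrix $\gamma_{\psi_{ent}}$.

Note that by lemmas || and |l^ we must have

$$\|\gamma_{\psi_{ent}1}\|^2 = 1 \qquad (25)$$

and

$$\sum_{l=1}^{d_2} \|\gamma_{\psi_{ent}l}\|^2 = 3. \qquad (26)$$

(23) can be written as

$$(\psi_{ent}|\psi^i_{ent}) = \frac{1}{4}\left(1 + \gamma_{\psi_{ent}} \cdot \gamma_{\psi^i_{ent}}\right) = \frac{1}{4}\left(1 + \mathrm{Tr}[\gamma^T_{\psi_{ent}} \gamma_{\psi^i_{ent}}]\right) = \frac{1}{4}\left(1 + \sum_{l=1}^{d_2} \|\gamma_{\psi_{ent}l}\|^2 - 2\|\gamma_{\psi_{ent}1}\|^2 - 2\|\gamma_{\psi_{ent}i}\|^2\right) \qquad (27)$$

This, (25) and (26) imply that:

$$\|\gamma_{\psi_{ent}1}\|^2 = \|\gamma_{\psi_{ent}i}\|^2 = 1 \qquad (28)$$

Since by definition $i = 2, \ldots, d_2$ we must have that
$\|\gamma_{\psi_{ent}i}\|^2 = 1$ for all $i$. From (26) the thesis follows.
■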
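The counting in this proof can be followed on a concrete example. The sketch below is illustrative only: the matrix `gamma` is an assumed correlation matrix of the block form (17) with $d_2 = 3$, and the function names (`overlap`, `flip`, `required_squared_norm`) are hypothetical. It evaluates the trace form of (27) for states with $\alpha = \beta = 0$, with $R^i$ acting on the left as $O_1$ does in (20), and solves the condition obtained by setting (27) to zero for the remaining squared norm.

```python
def overlap(gamma_a, gamma_b):
    """(1 + Tr[gamma_a^T gamma_b]) / 4, for states with alpha = beta = 0."""
    trace = sum(a * b
                for row_a, row_b in zip(gamma_a, gamma_b)
                for a, b in zip(row_a, row_b))
    return (1.0 + trace) / 4.0

def flip(gamma, i):
    """Left-multiply gamma by R^i (1-based i): change the sign of rows 1 and i."""
    return [[-x if k in (0, i - 1) else x for x in row]
            for k, row in enumerate(gamma)]

def required_squared_norm(total, first):
    """Solve 1 + total - 2*first - 2*n_i = 0 for n_i, i.e. (27) set to zero."""
    return (1.0 + total - 2.0 * first) / 2.0

# Assumed example of the form (17): gamma^{11} = 1 and C = diag(1, -1).
gamma = [[1.0, 0.0, 0.0],
         [0.0, 1.0, 0.0],
         [0.0, 0.0, -1.0]]

for i in (2, 3):
    assert overlap(gamma, flip(gamma, i)) == 0.0   # condition (23)

# With (25) and (26): every remaining vector must have squared norm 1,
# so the total 3 in (26) is reached by exactly three unit vectors.
assert required_squared_norm(total=3.0, first=1.0) == 1.0
```

The last line is the arithmetic behind (28): inserting the total 3 and the first squared norm 1 into (27) leaves squared norm 1 for each remaining $i$, so the sum in (26) fixes $d_2 = 3$.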

Corollary 1 An elementary system of a theory satisfying
axioms 1-4 is a projective space of dimension 1 over the
complex numbers, i.e. a $CP^1$.

Proof: From lemma ^ an elementary system must be
a unit $d_2$ dimensional sphere in $R^{d_2}$. From theorem 5,
$d_2 = 3$ and $SO(3)$ is the group of reversible transformations
transitive on a sphere in $R^3$. It is a mathematical
fact that $CP^1$ is a manifold isomorphic to the ordinary
sphere in three dimensional euclidean space,
with transitivity group $SO(3)$. ■
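The identification of $CP^1$ with the ordinary sphere can be made concrete with the standard Bloch-vector map. This is a sketch, not the construction used in the paper: the function name `bloch_vector` and the test amplitudes are assumptions made for illustration. A pair of complex amplitudes $(a, b)$, taken up to normalization and global phase, is mapped to a point of the unit sphere in $R^3$.

```python
import math

def bloch_vector(a, b):
    """Map amplitudes (a, b), not both zero, to a point on the unit sphere in R^3."""
    a, b = complex(a), complex(b)
    norm = math.sqrt(abs(a) ** 2 + abs(b) ** 2)
    a, b = a / norm, b / norm
    c = a.conjugate() * b
    return (2.0 * c.real, 2.0 * c.imag, abs(a) ** 2 - abs(b) ** 2)

def close(u, v, tol=1e-12):
    """Componentwise comparison of two 3-vectors."""
    return all(abs(x - y) < tol for x, y in zip(u, v))

north = bloch_vector(1, 0)
south = bloch_vector(0, 1)
# The two perfectly distinguishable states sit at antipodal points.
assert close(north, tuple(-x for x in south))

# A global phase does not change the point: the map is defined on CP^1.
phase = complex(math.cos(0.4), math.sin(0.4))
a, b = 0.6, 0.8j
assert close(bloch_vector(a, b), bloch_vector(phase * a, phase * b))

# Every image point has unit length.
assert abs(sum(x * x for x in bloch_vector(a, b)) - 1.0) < 1e-12
```

Antipodal points correspond to perfectly distinguishable states, in agreement with the representation of $|0)$ and $|1)$ used in the lemmas above.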



Corollary 2 Pure states of a system with information
capacity $n + 1$ described by a probabilistic theory satisfying
axioms 1-4 are points of a $CP^n$.

Proof: From theorem || the state space of a system with
information capacity $n + 1$ must be a projective space of
dimension $n$ over some field $K$. From lemma |^ and corollary 1,
every one dimensional subspace of this projective
space must be a $CP^1$. This implies the thesis. ■



IV. DISCUSSION 
A. The subspace axiom 

In ||2^, |2^, what we here generically call the
"subspace axiom" was assumed. A related axiom, compression,
has also been assumed; it states more or less the same thing
as axiom subspace in a way consistent with the graphical
formalism invented by its authors. The requirement in
these axioms is that there must exist a linear mapping
between a subspace of a system and the state space of a
different system with information capacity equal to that of
the subspace, such that the two spaces (i.e. the state space
of one system and the subspace of the other) can be
regarded as different representations of the same
mathematical object. The intuitive justification for this axiom
is that the mathematical object used to represent a set
of states depends only on the information capacity of
that set. This is clearly a mathematical requirement on
the state space of a physical system and does not deal
with information theory. We see from theorem ^ that
this property is derived from the projective space
structure of quantum theory. This structure, in turn, is
derived from axiom distinguishability and axiom conservation,
which deal with the storage and retrieval of information
in a physical system (see section ^) and do not refer
to any mathematical property that the state space of a
physical system must have. In p6| a new axiom, sturdiness,
was formulated to prove axiom subspace in
the duotensor framework. Despite its "operational"
formulation, sturdiness strongly relies on the notion of filter
as used in quantum mechanics. This axiom can thus
hardly be regarded as an information theoretic constraint on
a probabilistic theory, and in the context of the present
reconstruction it would be out of place.

Requiring axioms like the subspace one in a derivation
of quantum theory from informational/operational
requirements means requiring a priori much of the
mathematical structure of the theory without giving a
justification in terms of more basic principles. In the present
reconstruction we are able to get rid of axiom subspace in
favor of two simpler axioms related to the storage and
retrieval of information in physical systems. It is
remarkable that deriving an important piece of the mathematical
structure of quantum theory from more basic principles
permits us to classify probabilistic models different from
quantum theory that share with quantum theory only
some of the natural requirements we imposed in section
II but not others.



B. Complex amplitudes and composite systems



It is shown in section [II A that if a probabilistic
theory satisfies axioms 1-5 then either it is classical or it
constitutes a generalization of quantum theory in which the
superposition principle holds with amplitudes not
necessarily complex but belonging to a generic field of
numbers. Among these theories we find quantum theory but
also quantum theory over reals and quantum theory over
quaternions, corresponding respectively to real and
quaternionic amplitudes in superpositions. The result in
corollary |^ implies that, among the theories in this class,
using complex amplitudes in superpositions is equivalent
to requiring axiom composition for the description of
composite systems.

It has been shown that quantum theory over reals
satisfies 2-local tomography for the composite system.

Definition 17 A theory satisfies 2-Local Tomography if
the state of a composite system (ABC...) can be
determined from the statistics of measurements on the single
components (A, B, C, ...) and the statistics of bipartite
measurements on two components at a time (AB, AC,
BC, ...).

2-local tomography differs from local tomography
because in the latter the statistics of measurements on the
single components suffice to determine the state. The
result of ||2^ combined with the present derivation of
quantum theory is very interesting. We have in fact proved
that quantum theory over reals is, together with quantum
theory, in the class of probabilistic theories satisfying
axioms 1-5. On the other hand quantum theory is the only
theory in this class satisfying axiom composition with
local tomography. As noted above, quantum theory
over reals satisfies 2-local tomography in place of local
tomography. This implies that quantum theory over reals
satisfies the same informational axioms as quantum
theory except for the one concerning composite systems, in
which local tomography must be replaced by 2-local
tomography. We are now in a position to state a
conjecture similar to the one stated in |2^: quantum theory
over reals is the only probabilistic theory satisfying the
following list of axioms: system, distinguishability,
capacity, conservation, reversibility and composition with
2-local tomography.

It is very natural that our description of the physical
world involves local tomography instead of 2-local
tomography for describing states of composite systems.
Indeed, if the latter were used then the mere fact that
two different physical systems are described at once as a
single system would imply that the composite system
has more degrees of freedom than the cartesian product
of the degrees of freedom of the components.
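The remark on degrees of freedom can be illustrated by a parameter count. This is a minimal sketch under stated assumptions: it uses the standard counts of real parameters for unnormalized states of an $n$-level system ($n^2$ for complex Hermitian matrices, $n(n+1)/2$ for real symmetric matrices), and the function names `num_parameters` and `counts_multiply` are hypothetical. It checks only the counting condition that goes with local tomography, namely that the number of parameters of the composite equals the product of those of the components.

```python
def num_parameters(n, field):
    """Real parameters of an unnormalized state of an n-level system."""
    if field == "complex":
        return n * n               # n x n complex Hermitian matrices
    if field == "real":
        return n * (n + 1) // 2    # n x n real symmetric matrices
    raise ValueError("field must be 'complex' or 'real'")

def counts_multiply(n_a, n_b, field):
    """True if the composite has exactly the product of the component counts."""
    composite = num_parameters(n_a * n_b, field)
    return composite == num_parameters(n_a, field) * num_parameters(n_b, field)

# Two two-level systems.
print(num_parameters(4, "complex"), num_parameters(2, "complex") ** 2)  # 16 16
print(num_parameters(4, "real"), num_parameters(2, "real") ** 2)        # 10 9

assert counts_multiply(2, 2, "complex")
assert not counts_multiply(2, 2, "real")
```

For two real two-level systems the composite needs 10 parameters while the components supply only 9, so some information about the joint state is not accessible from local statistics alone; this is the excess of degrees of freedom described above.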

V. CONCLUSION 

The mathematical rules governing quantum theory can
be understood in terms of information theoretic
principles. A physical system is thought of as something in which
information can be stored. The maximal amount of
information that can be stored in a given system has a definite
value. Probabilistic theories are a very general framework
in which physical systems can be described in terms of
operational primitives such as preparations,
transformations and measurement outcomes. Four natural axioms
for physical systems in the probabilistic theories
framework are Distinguishability, Conservation, Reversibility and
Composition. Theories satisfying the first two axioms
are either classical or such that pure states of a
physical system are points of a projective space defined over
a generic field of numbers. This latter class of theories
constitutes a generalization of quantum theory in which
the superposition principle holds with amplitudes not
necessarily complex but belonging to a generic field of
numbers. Examples of theories in this class are quantum
theory, real quantum theory and quaternionic quantum theory.
This result establishes a connection between two different
approaches to quantum foundations, quantum logic and
the one based on information theoretic primitives. Such a
connection is provided by theorem |^. More importantly,
this result permits us to get rid of the subspace axiom in the
reconstruction. Axiom subspace was always more or less
explicitly used in previous reconstructions and contains a
mathematical constraint on the state space of a physical
system. We also classify all the possible probabilistic
theories that are consistent with three subsets of
the set of axioms given; this could be useful in
experimental tests of quantum mechanics, to check whether
the results of these tests could be consistent with other
mathematical models.